Publication detail

Utilization of databases with missing data for classification of the EU regions

ODEHNAL, J.; NEUBAUER, J.; MICHÁLEK, J.

Czech title

Použití neúplných datových souborů ke klasifikaci regionů EU

English title

Utilization of databases with missing data for classification of the EU regions

Type

journal article - other

Language

Czech (cs)

Original abstract

Použití neúplných datových souborů ke klasifikaci regionů EU. Empirická analýza.

Czech abstract

Použití neúplných datových souborů ke klasifikaci regionů EU. Empirická analýza.

English abstract

The paper deals with the clustering of 202 European NUTS 2 regions into groups with similar values of 22 economic variables. The data were obtained from the Eurostat Regional Yearbook 2007 and from the Regional Statistics database, and they contain a high number of missing values. The data analysis focuses primarily on filling in the missing values. Three methods for filling in missing values were used and compared: filling by the average, by the median, and by the ZET algorithm described in [22]. The clustering results are presented in tables and in a dendrogram. The classification results are then compared with regard to the method used for handling missing data. The conclusion is that the ZET algorithm is a suitable statistical technique for filling in missing data in the considered data files.

Keywords in Czech

chybějící data, ZET algoritmus, konkurenceschopnost, klasifikace regionů EU

Keywords in English

missing data, ZET algorithm, competitiveness, NUTS classification of EU regions

RIV year

2009

Released

04.05.2009

Publisher

Český statistický úřad

Location

Praha

ISSN

0322-788X

Volume

2009

Number

5

Pages from–to

446–461

Pages count

16

BIBTEX


@article{BUT48217,
  author="Jaroslav {Michálek}",
  title="Použití neúplných datových souborů ke klasifikaci regionů EU",
  year="2009",
  volume="2009",
  number="5",
  month="May",
  pages="446--461",
  publisher="Český statistický úřad",
  address="Praha",
  issn="0322-788X"
}