Which file formats in Hadoop support columnar data storage?


Columnar file formats (Parquet, RCFile)

The latest trend in file formats for Hadoop is columnar storage. Instead of only storing rows of data next to one another, the values of each column are also stored next to one another. Datasets are therefore partitioned both horizontally (into groups of rows) and vertically (into columns).
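The horizontal and vertical partitioning described above can be illustrated with a small pure-Python sketch. It is not how Parquet or RCFile are implemented; the record fields (`user_id`, `url`, `ms`), the helper `to_row_groups`, and the group size are all hypothetical names chosen for illustration.

```python
from typing import Any

# Hypothetical sample records; the field names are illustrative only.
ROWS = [
    {"user_id": 1, "url": "/home", "ms": 120},
    {"user_id": 2, "url": "/cart", "ms": 340},
    {"user_id": 3, "url": "/home", "ms": 95},
    {"user_id": 4, "url": "/pay", "ms": 410},
]


def to_row_groups(rows: list[dict[str, Any]], group_size: int) -> list[dict[str, list[Any]]]:
    """Split rows into groups (horizontal), then store each group column by column (vertical)."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # Within one group, gather all values of the same column together.
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        groups.append(columns)
    return groups


ROW_GROUPS = to_row_groups(ROWS, group_size=2)
for index, group in enumerate(ROW_GROUPS):
    print(index, group)
```

Each printed group holds two rows, but the values inside it are arranged per column, which is the general idea behind the row-group layout the paragraph describes.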

Besides this, in which formats does Hadoop handle data?

There are several Hadoop-specific file formats that were created specifically to work well with MapReduce. These include file-based data structures such as sequence files, serialization formats such as Avro, and columnar formats such as RCFile and Parquet.

One might also ask, what is a columnar file format? In the context of row and columnar storage for Hive, ORC is a columnar storage format used in Hadoop for Hive tables. It is an efficient format for storing data whose records have many columns. One example is clickstream (web) data, which is analyzed to understand website activity and performance.

Similarly, one may ask, what are the file formats in Hadoop?

The basic file formats are: text format, key-value format, and sequence format. Other well-known formats in use are: Avro, Parquet, RC (Record Columnar) format, and ORC (Optimized Row Columnar) format.

Why are columnar file formats used in data warehousing?

ORC stores row data in a columnar format. This row-columnar layout is highly efficient for compression and storage. It allows parallel processing across a cluster, and the columnar layout makes it possible to skip columns a query does not need, which speeds up processing and decompression.
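Column skipping can be sketched with the Python standard library alone. The example below is a simplified illustration, not ORC's actual on-disk format: each column is compressed into its own blob with `zlib`, and a reader decompresses only the columns it asks for. The column names and the helpers `write_columnar` and `read_columns` are hypothetical.

```python
import json
import zlib

# Hypothetical column data; names and values are illustrative only.
COLUMNS = {
    "user_id": [1, 2, 3, 4],
    "url": ["/home", "/cart", "/home", "/pay"],
    "ms": [120, 340, 95, 410],
}


def write_columnar(columns):
    """Compress every column into its own blob so each can be read independently."""
    return {
        name: zlib.compress(json.dumps(values).encode("utf-8"))
        for name, values in columns.items()
    }


def read_columns(stored, wanted):
    """Decompress only the requested columns; the other blobs are never touched."""
    return {
        name: json.loads(zlib.decompress(stored[name]).decode("utf-8"))
        for name in wanted
    }


STORED = write_columnar(COLUMNS)
# A query that averages response time needs only the "ms" column.
SELECTED = read_columns(STORED, ["ms"])
AVERAGE_MS = sum(SELECTED["ms"]) / len(SELECTED["ms"])
print(AVERAGE_MS)
```

In a row-oriented layout the same query would have to read and decode every field of every record, so the saving grows with the number of unused columns.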
