Video: Dab tsi yog Impala hauv cov ntaub ntawv loj?
2024 Tus sau: Lynn Donovan | [email protected]. Kawg hloov kho: 2023-12-15 23:47
Impala yog qhov chaw qhib loj heev thaum uas tig mus ua cov lus nug cav nyob rau sab saum toj ntawm cov kab ke xws li Apache Hadoop. Nws tau tsim raws li Google's Dremel ntawv. Nws yog kev sib tham sib SQL zoo li cov lus nug cav uas khiav saum Hadoop Distributed File System (HDFS). Impala siv HDFS raws li nws lub hauv paus cia.
Hais txog qhov no, Impala thiab Hive yog dab tsi?
Apache Hive yog tus qauv zoo rau SQL-in-Hadoop. Impala yog qhov qhib SQL lus nug cav tsim tom qab Google Dremel. Cloudera Impala yog lub cav SQL rau kev ua cov ntaub ntawv khaws cia hauv HBase thiab HDFS. Impala siv Hive megastore thiab tuaj yeem nug cov Hive rooj ncaj qha.
Tsis tas li ntawd, qhov twg zoo dua Hive lossis Impala? Apache Hive tej zaum yuav tsis zoo tagnrho rau kev sib tham sib xam qhov twg Impala yog txhais tau rau kev sib tham sib xam. Hive yog batch raws li Hadoop MapReduce whereas Impala yog ntau zoo li MPP database. Hive txhawb hom complex tab sis Impala tsis. Apache Hive yog fault tolerant whereas Impala tsis txhawb kev zam txim.
Kuj nug, vim li cas peb thiaj siv Impala?
Impala txhawb nqa hauv-nco cov ntaub ntawv ua, piv txwv li, nws nkag mus / tshuaj xyuas cov ntaub ntawv uas yog khaws cia ntawm Hadoop cov ntaub ntawv nodes yam tsis muaj cov ntaub ntawv txav. Koj ua tau nkag tau cov ntaub ntawv siv Impala siv SQL-zoo li cov lus nug. Impala muab kev nkag tau sai rau cov ntaub ntawv hauv HDFS thaum piv rau lwm lub cav SQL.
Dab tsi yog Hive hauv cov ntaub ntawv loj?
Apache Hive yog a cov ntaub ntawv warehouse system rau cov ntaub ntawv sumarization thiab tsom xam thiab querying loj cov ntaub ntawv systems nyob rau hauv qhib-qhov chaw Hadoop platform. Nws hloov cov lus nug zoo li SQL rau hauv MapReduce cov haujlwm kom ua tau yooj yim thiab ua tiav ntawm cov ntim loj heev. cov ntaub ntawv.
Pom zoo:
Dab tsi yog cov ntaub ntawv hloov pauv hauv cov ntaub ntawv khaws cia?
Cov ntaub ntawv hloov pauv yog cov ntaub ntawv uas tau tsim nyob rau hauv daim ntawv thov kev sib ntsib, uas tsis tau khaws cia hauv cov ntaub ntawv tom qab daim ntawv thov raug txiav
Dab tsi yog kev noj cov ntaub ntawv hauv cov ntaub ntawv loj?
Cov ntaub ntawv ingestion yog tus txheej txheem ntawm tau txais thiab import cov ntaub ntawv rau tam sim ntawd siv los yog cia nyob rau hauv ib tug database. Txhawm rau noj ib yam dab tsi yog 'coj ib yam dab tsi los yog nqus ib yam dab tsi.' Cov ntaub ntawv tuaj yeem xa tawm hauv lub sijhawm tiag tiag lossis noj hauv cov khoom siv
Vim li cas kem taw qhia cov ntaub ntawv khaws cia ua cov ntaub ntawv nkag ntawm disks sai dua li kab qhia cov ntaub ntawv khaws cia?
Kem oriented databases (aka columnar databases) yog qhov tsim nyog rau kev ntsuas kev ua haujlwm ntau dua vim tias cov ntaub ntawv hom ntawv (kem hom) qiv nws tus kheej kom nrawm dua cov lus nug ua - scans, aggregation thiab lwm yam. Ntawm qhov tod tes, kab oriented databases khaws ib kab (thiab tag nrho nws. kab) contiguously
Dab tsi yog cov teeb meem ntawm kev tswj cov ntaub ntawv hauv cov ntaub ntawv ib txwm muaj?
Nyob rau tib lub sijhawm, qhov kev tswj hwm cov ntaub ntawv ib puag ncig tsim teeb meem xws li cov ntaub ntawv rov ua dua thiab tsis sib xws, cov kev pab cuam-cov ntaub ntawv dependence, inflexibility, tsis muaj kev ruaj ntseg, thiab tsis muaj cov ntaub ntawv sib qhia thiab muaj
Txoj hauv kev zoo tshaj plaws rau daim ntawv thov upload cov ntaub ntawv loj hauv s3 yog dab tsi?
Cov ntaub ntawv loj tshaj plaws uas tuaj yeem muab tso rau hauv Amazon S3 thoob hauv ib qho kev ua haujlwm PUT yog 5 GB. Yog tias koj xav upload cov khoom loj (> 5 GB), koj yuav txiav txim siab siv multipart upload API, uas tso cai rau upload cov khoom ntawm 5 MB mus txog 5 TB