Video: What is an RDD in Scala?
2024 Author: Lynn Donovan | [email protected]. Last updated: 2023-12-15 23:47
Resilient Distributed Datasets (RDDs) are the fundamental data structure of Spark. An RDD is an immutable, distributed collection of objects. RDDs can contain any type of Python, Java, or Scala object, including user-defined classes. Formally, an RDD is a read-only, partitioned collection of records.
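As a minimal sketch of the definition above, the following Scala snippet builds two small RDDs on a local Spark session and shows that a transformation returns a new RDD instead of changing the original. It assumes Spark (the `spark-sql` artifact) is on the classpath; the names `RddBasics`, `Reading`, and `run` are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object RddBasics {
  // Hypothetical user-defined class, here only to show that an RDD can hold custom objects.
  case class Reading(sensor: String, value: Double)

  def run(): (Seq[Int], Seq[Int], Seq[String]) = {
    // Assumption: a local two-thread master is enough for a demonstration.
    val spark = SparkSession.builder().master("local[2]").appName("rdd-basics").getOrCreate()
    val sc = spark.sparkContext
    try {
      // Distribute a local collection as an RDD.
      val numbers = sc.parallelize(Seq(1, 2, 3, 4))
      // map returns a new RDD; `numbers` itself is never modified.
      val doubled = numbers.map(_ * 2)
      // An RDD of user-defined objects.
      val readings = sc.parallelize(Seq(Reading("a", 1.5), Reading("b", 2.5)))
      (numbers.collect().toSeq, doubled.collect().toSeq, readings.map(_.sensor).collect().toSeq)
    } finally spark.stop()
  }
}
```

Calling `RddBasics.run()` returns the original numbers, the doubled numbers, and the sensor names, which makes it easy to check that the first RDD is unchanged after the `map`.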
The question remains: what is the difference between an RDD and a DataFrame?
RDD: An RDD is a distributed collection of data elements spread across many machines in a cluster. RDDs are a set of Java or Scala objects representing data. DataFrame: A DataFrame is a distributed collection of data organized into named columns. It is conceptually equivalent to a table in a relational database.
Also, how is an RDD partitioned? Resilient Distributed Datasets (RDDs) are distributed collections of objects stored in memory or on disk on different machines of a cluster. A single RDD can be divided into multiple logical partitions so that these partitions can be stored and processed on different machines of the cluster.
How does an RDD work?
An RDD in Spark holds a collection of records that is split into partitions. Partitions are small logical chunks of the data, and when an action is executed, one task is launched per partition. Partitions are therefore the basic units of parallelism in an RDD.
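The partitioning described above can be inspected directly. The sketch below, which assumes Spark is on the classpath and uses the hypothetical names `RddPartitions` and `run`, asks for four partitions and then uses `glom()` to gather each partition's elements into an array so the layout becomes visible.

```scala
import org.apache.spark.sql.SparkSession

object RddPartitions {
  def run(): (Int, Seq[Seq[Int]]) = {
    // Assumption: local mode stands in for a real cluster here.
    val spark = SparkSession.builder().master("local[2]").appName("rdd-partitions").getOrCreate()
    try {
      // Request four partitions explicitly through the numSlices parameter.
      val rdd = spark.sparkContext.parallelize(1 to 8, numSlices = 4)
      // glom() turns every partition into one array of its elements.
      val layout = rdd.glom().collect().map(_.toSeq).toSeq
      (rdd.getNumPartitions, layout)
    } finally spark.stop()
  }
}
```

On a real cluster each of those partitions could be processed by a task on a different machine; in local mode they are simply handled by the available threads.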
Which is faster, RDD or DataFrame?
RDD: The RDD API is slower for simple grouping and aggregation operations. DataFrame: For exploratory analysis and for computing aggregated statistics on data, DataFrames are faster, because Spark can optimize DataFrame queries through its Catalyst optimizer. RDD: When you need low-level transformations and actions, use RDDs. When high-level abstractions are what you need, DataFrames are the better fit.
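To make the comparison concrete, here is a small sketch that computes the same grouped sum with both APIs. It does not measure performance; it only shows how the two styles differ. It assumes Spark is on the classpath, and the object name, the sample data, and the column names are all made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object GroupingComparison {
  def run(): (Map[String, Int], Map[String, Long]) = {
    val spark = SparkSession.builder().master("local[2]").appName("grouping").getOrCreate()
    import spark.implicits._
    try {
      // Hypothetical sample data: (category, amount).
      val sales = Seq(("books", 10), ("games", 5), ("books", 7))

      // RDD API: the aggregation logic is spelled out as a function on key/value pairs.
      val viaRdd = spark.sparkContext.parallelize(sales).reduceByKey(_ + _).collect().toMap

      // DataFrame API: the aggregation is declared, and Spark plans how to run it.
      val viaDf = sales.toDF("category", "amount")
        .groupBy("category")
        .agg(sum("amount").as("total"))
        .collect()
        .map(row => row.getString(0) -> row.getLong(1))
        .toMap

      (viaRdd, viaDf)
    } finally spark.stop()
  }
}
```

Both results contain the same totals per category. The DataFrame version returns `Long` values because summing an integer column yields a long column.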
Recommended:
What is an sbt project in Scala?
sbt is an open-source build tool for Scala and Java projects, similar to Java's Maven and Ant. Its main features are native support for compiling Scala code, integration with many Scala test frameworks, and continuous compilation, testing, and deployment.
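For orientation, a minimal `build.sbt` might look like the sketch below. The project name and all version numbers are assumptions picked for illustration, not recommendations from the original article; adjust them to the versions you actually use.

```scala
// build.sbt (illustrative sketch; name and versions are assumptions)
name := "rdd-examples"
version := "0.1.0"
scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version to the artifact name.
  "org.apache.spark" %% "spark-sql" % "3.5.0",
  "org.scalatest"    %% "scalatest" % "3.2.17" % Test
)
```

With such a file in place, `sbt compile` compiles the project, `sbt test` runs the tests, and prefixing a command with a tilde in the sbt shell (for example `~test`) reruns it whenever a source file changes.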
What are actors in Scala?
Actors were Scala's primary concurrency construct in the original standard library (the scala.actors package, which has since been deprecated in favor of Akka). Actors are essentially concurrent processes that communicate by exchanging messages. They can also be seen as a form of active object, where invoking a method corresponds to sending a message.
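The message-passing idea can be sketched without any actor library. The code below is not the scala.actors or Akka API; it is a hand-rolled stand-in built on a blocking queue and one thread, and every name in it (`MiniActorDemo`, `CounterActor`, `Add`, `Stop`) is hypothetical.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}

object MiniActorDemo {
  sealed trait Message
  final case class Add(n: Int) extends Message
  final case class Stop(done: CountDownLatch) extends Message

  // Private state, a mailbox, and a single thread that handles one message at a time.
  final class CounterActor {
    private val mailbox = new LinkedBlockingQueue[Message]()
    @volatile private var total = 0

    private val worker = new Thread(() => {
      var running = true
      while (running) mailbox.take() match {
        case Add(n)     => total += n
        case Stop(done) => running = false; done.countDown()
      }
    })
    worker.setDaemon(true)
    worker.start()

    // Sending a message only enqueues it; the caller never touches the state directly.
    def !(msg: Message): Unit = mailbox.put(msg)
    def current: Int = total
  }

  def run(): Int = {
    val actor = new CounterActor
    val done = new CountDownLatch(1)
    (1 to 5).foreach(i => actor ! Add(i))
    actor ! Stop(done)
    done.await() // wait until every queued message has been handled
    actor.current
  }
}
```

Because only the worker thread ever updates `total`, no locking is needed around the state, which is the main appeal of the actor style. A real project would normally use an actor library instead of this sketch.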
What is a DataFrame in Spark Scala?
A Spark DataFrame is a distributed collection of data organized into named columns. It provides operations to filter, group, or compute aggregates, and it can be used with Spark SQL. DataFrames can be constructed from structured data files, existing RDDs, tables in Hive, or external databases.
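As an illustration of the points above, this sketch builds a DataFrame from an existing RDD and then runs the same filter once through the DataFrame API and once through Spark SQL. It assumes Spark is on the classpath; `DataFrameBasics`, `Person`, the sample rows, and the view name `people` are invented for the example.

```scala
import org.apache.spark.sql.SparkSession

object DataFrameBasics {
  // Hypothetical record type; its field names become the DataFrame's column names.
  case class Person(name: String, age: Int)

  def run(): (Seq[String], Seq[String]) = {
    val spark = SparkSession.builder().master("local[2]").appName("df-basics").getOrCreate()
    import spark.implicits._
    try {
      // Start from an existing RDD and convert it to a DataFrame.
      val peopleRdd = spark.sparkContext.parallelize(
        Seq(Person("Ana", 34), Person("Bo", 19), Person("Cy", 52)))
      val people = peopleRdd.toDF()

      // Filter and select with the DataFrame API.
      val viaApi = people.filter($"age" > 30).select("name").orderBy("name")
        .as[String].collect().toSeq

      // The same query through Spark SQL, using a temporary view.
      people.createOrReplaceTempView("people")
      val viaSql = spark.sql("SELECT name FROM people WHERE age > 30 ORDER BY name")
        .as[String].collect().toSeq

      (viaApi, viaSql)
    } finally spark.stop()
  }
}
```

Both queries return the same names, which shows that the DataFrame API and Spark SQL are two front ends to the same engine.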
What is override in Scala?
Scala method overriding. When a subclass defines a method with the same name and signature as one defined in its parent class, this is known as method overriding. A subclass overrides a method when it wants to provide its own implementation of a method defined in the parent class. Scala requires the override keyword when overriding a concrete member.
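A short sketch of method overriding follows. The classes `Animal` and `Dog` are hypothetical and exist only to show the `override` keyword in use.

```scala
object OverrideExample {
  class Animal {
    def sound(): String = "..."
    // describe() calls sound(), so subclasses change its result by overriding sound().
    def describe(): String = s"This animal says ${sound()}"
  }

  class Dog extends Animal {
    // Omitting `override` here would be a compile error, because sound() is concrete in Animal.
    override def sound(): String = "Woof"
  }
}
```

Calling `describe()` on a `Dog` uses the overridden `sound()`, even though `describe()` itself is defined only in `Animal`.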
What is an implicit class in Scala?
Scala 2.10 introduced a new feature called implicit classes. An implicit class is a class marked with the implicit keyword. This keyword makes the class's primary constructor available for implicit conversions when the class is in scope. Implicit classes were proposed in SIP-13.
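The following sketch shows a typical use of an implicit class: adding a method to an existing type. The names `StringSyntax`, `RichWords`, and `wordCount` are hypothetical. In Scala 2 an implicit class cannot be declared at the top level, so it is placed inside an object here.

```scala
object StringSyntax {
  // Wraps a String; extending AnyVal is an optional choice that avoids allocating a wrapper.
  implicit class RichWords(private val text: String) extends AnyVal {
    // Counts whitespace-separated words, ignoring empty fragments.
    def wordCount: Int = text.trim.split("\\s+").count(_.nonEmpty)
  }
}
```

After `import StringSyntax._`, the compiler rewrites `"resilient distributed datasets".wordCount` into a call through the `RichWords` constructor, so the method appears to exist on `String` itself.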