---
title: Project work submission
tags: report
description: Introduction to Big Data Science.
---

# Project work submission

## Team
- Takuya Sukegawa (s1260220)
- Shihomi Hashimoto (m5251107)
- Aoshi Suzuki (s1260241)

## Purpose
Implement Hadoop/Spark-based K-means/K-means++ algorithms in PAMI.

## Source
- kmeans.py
- answer.txt
- README.md (this file)

## Run
```
python3 kmeans.py
```

## Result
- Each row and its cluster ID number <br> 
- Cluster centers <br> 

## Reference
- https://spark.apache.org/docs/latest/ml-clustering.html (MLlib)
- https://spark.apache.org/docs/latest/mllib-clustering.html (using RDD)
- https://rsandstroem.github.io/sparkkmeans.html
- https://github.com/seraogianluca/k-means-mapreduce
- https://blog.imind.jp/entry/2019/09/14/141742
- https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.clustering.KMeans.html
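
## Sketch: K-means++ in plain Python
The contents of `kmeans.py` are not reproduced in this README, so the following is only a minimal single-machine sketch of the K-means/K-means++ idea named under Purpose, not the team's Spark implementation. It uses the Python standard library only. The function names (`squared_distance`, `kmeans_pp_init`, `kmeans`) and the sample points are hypothetical and chosen for illustration. Like the Result section, it produces a cluster ID per row and the cluster centers.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng):
    """Choose k initial centers with K-means++ seeding (illustrative helper)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform choice.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm with K-means++ seeding. Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each row gets the ID of its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Hypothetical sample data: two well-separated groups of 2-D points.
sample_points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
sample_labels, sample_centers = kmeans(sample_points, k=2)
for row, cluster_id in zip(sample_points, sample_labels):
    print(row, cluster_id)
print(sample_centers)
```

In a Spark setting the assignment and update steps would be distributed across partitions. This sketch keeps everything in one process so the two steps are easy to read.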