I'm computing connected components using Spark GraphX on AWS EC2. I believe the computation succeeded, because I saw the type information of the final result. However, Spark then appeared to be doing some cleanup: the BlockManager removed a number of blocks and then got stuck at:
15/07/04 21:53:06 INFO storage.BlockManager: Removing block rdd_334_4
15/07/04 21:53:06 INFO storage.MemoryStore: Block rdd_334_4 of size 25986936 dropped from memory (free 15648106262)
There was no error message and no further output for about an hour. When I pressed the Enter key, I was disconnected from the cluster. Does anyone know what's going on here?
I used 8 r3.4xlarge instances. I have 7 million edges and 200 million vertices.
Thank you!