Saturday, July 4, 2015

Spark got stuck with BlockManager when computing connected components using GraphX

I'm computing connected components with Spark GraphX on AWS EC2. I believe the computation succeeded, because I saw the type information of the final result. After that, Spark appeared to be doing some cleanup: the BlockManager removed a number of blocks and then got stuck at

15/07/04 21:53:06 INFO storage.BlockManager: Removing block rdd_334_4

15/07/04 21:53:06 INFO storage.MemoryStore: Block rdd_334_4 of size 25986936 dropped from memory (free 15648106262)

There was no error message and no further output for about an hour. When I pressed the Enter key, I was disconnected from the cluster. Does anyone know what's going on here?

I used 8 r3.4xlarge instances. I have 7 million edges and 200 million vertices.
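
For readers unfamiliar with the computation itself: a connected components job of this kind labels every vertex with the smallest vertex id in its component. The sketch below is not my Spark code and not the GraphX implementation; it is a minimal single-machine illustration in plain Python, and the function name `connected_components` and its parameters are made up for this example.

```python
def connected_components(edges, vertices=()):
    """Label every vertex with the smallest vertex id in its component.

    Hypothetical helper for illustration only; it runs on one machine
    and is not the GraphX implementation.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    # Every vertex (including isolated ones) starts labelled with its own id.
    labels = {v: v for v in set(neighbors) | set(vertices)}

    # Repeatedly take the minimum label among a vertex and its neighbors
    # until a full pass changes nothing.
    changed = True
    while changed:
        changed = False
        for v in labels:
            candidates = [labels[v]] + [labels[n] for n in neighbors.get(v, ())]
            best = min(candidates)
            if best < labels[v]:
                labels[v] = best
                changed = True
    return labels
```

For example, `connected_components([(1, 2), (2, 3), (5, 4)], vertices=[6])` returns `{1: 1, 2: 1, 3: 1, 4: 4, 5: 4, 6: 6}` (key order may differ): vertices 1, 2 and 3 share label 1, vertices 4 and 5 share label 4, and the isolated vertex 6 keeps its own id.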

Thank you!



