aws | HaVeDa - A Backend Development Blog

Out of the Middle Ages: Use S3a File System with Spark (2.x), Hadoop (2.7.x) and AWS SDK (>1.7.4)

August 13, 2017 | August 15, 2017 | ywilkof | 1 Comment

Much has been said about the hardships of combining Apache Spark, the Hadoop libraries and Amazon’s AWS SDK. Take, for example, reading from S3 storage using the s3a:// file system. If you’ve ever tried this setup, you know it is not a straightforward task. While it seems like this should work out…