Java SparkSql 2.4.0 ArrayIndexOutOfBoundsException error
I am new to Spark and am trying to read a CSV file in a Java Maven project, but I am getting an ArrayIndexOutOfBoundsException.
Dependencies:
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>2.4.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>
Code:
SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL Example")
.config("spark.master", "local")
.getOrCreate();
//Read file
Dataset<Row> df = spark.read()
.format("csv")
.load("test.csv");
CSV file:
name,code
A,1
B,3
C,5
Here is the stack trace:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/11/12 11:16:25 INFO SparkContext: Running Spark version 2.4.0
18/11/12 11:16:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/11/12 11:16:25 INFO SparkContext: Submitted application: Java Spark SQL Example
18/11/12 11:16:25 INFO SecurityManager: Changing view acls to: vkumar
18/11/12 11:16:25 INFO SecurityManager: Changing modify acls to: vkumar
18/11/12 11:16:25 INFO SecurityManager: Changing view acls groups to:
18/11/12 11:16:25 INFO SecurityManager: Changing modify acls groups to:
18/11/12 11:16:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vkumar); groups with view permissions: Set(); users with modify permissions: Set(vkumar); groups with modify permissions: Set()
18/11/12 11:16:26 INFO Utils: Successfully started service 'sparkDriver' on port 54382.
18/11/12 11:16:26 INFO SparkEnv: Registering MapOutputTracker
18/11/12 11:16:26 INFO SparkEnv: Registering BlockManagerMaster
18/11/12 11:16:26 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/11/12 11:16:26 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/11/12 11:16:26 INFO DiskBlockManager: Created local directory at /private/var/folders/32/7klbv5_94wbddn9kgzvbwnkr0000gn/T/blockmgr-ddd2a79d-7fae-48e4-9658-1a8e2a8bb734
18/11/12 11:16:26 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
18/11/12 11:16:26 INFO SparkEnv: Registering OutputCommitCoordinator
18/11/12 11:16:26 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/11/12 11:16:26 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://kirmac133.domain.com:4040
18/11/12 11:16:26 INFO Executor: Starting executor ID driver on host localhost
18/11/12 11:16:26 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 54383.
18/11/12 11:16:26 INFO NettyBlockTransferService: Server created on kirmac133.domain.com:54383
18/11/12 11:16:26 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/11/12 11:16:26 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, kirmac133.domain.com, 54383, None)
18/11/12 11:16:26 INFO BlockManagerMasterEndpoint: Registering block manager kirmac133.domain.com:54383 with 4.1 GB RAM, BlockManagerId(driver, kirmac133.domain.com, 54383, None)
18/11/12 11:16:26 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, kirmac133.domain.com, 54383, None)
18/11/12 11:16:26 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, kirmac133.domain.com, 54383, None)
18/11/12 11:16:26 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/vkumar/Documents/intellij/tmvalidator/spark-warehouse').
18/11/12 11:16:26 INFO SharedState: Warehouse path is 'file:/Users/vkumar/Documents/intellij/tmvalidator/spark-warehouse'.
18/11/12 11:16:27 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/11/12 11:16:28 INFO FileSourceStrategy: Pruning directories with:
18/11/12 11:16:28 INFO FileSourceStrategy: Post-Scan Filters: (length(trim(value#0, None)) > 0)
18/11/12 11:16:28 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
18/11/12 11:16:28 INFO FileSourceScanExec: Pushed Filters:
18/11/12 11:16:29 INFO CodeGenerator: Code generated in 130.688148 ms
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
at scala.collection.Iterator.foreach(Iterator.scala:937)
at scala.collection.Iterator.foreach$(Iterator.scala:937)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
at scala.collection.IterableLike.foreach(IterableLike.scala:70)
at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.findConstructorParam$1(BeanIntrospector.scala:58)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$19(BeanIntrospector.scala:176)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:32)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:29)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:194)
at scala.collection.TraversableLike.map(TraversableLike.scala:233)
at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:194)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14(BeanIntrospector.scala:170)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14$adapted(BeanIntrospector.scala:169)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
at scala.collection.immutable.List.foreach(List.scala:388)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
at scala.collection.immutable.List.flatMap(List.scala:351)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.apply(BeanIntrospector.scala:169)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$._descriptorFor(ScalaAnnotationIntrospectorModule.scala:22)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.fieldName(ScalaAnnotationIntrospectorModule.scala:30)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.findImplicitPropertyName(ScalaAnnotationIntrospectorModule.scala:78)
at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findImplicitPropertyName(AnnotationIntrospectorPair.java:467)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._addFields(POJOPropertiesCollector.java:351)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.collectAll(POJOPropertiesCollector.java:283)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.getJsonValueMethod(POJOPropertiesCollector.java:169)
at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.findJsonValueMethod(BasicBeanDescription.java:223)
at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByAnnotations(BasicSerializerFactory.java:348)
at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:210)
at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:153)
at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1203)
at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1157)
at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:481)
at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:679)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:107)
at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:142)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:339)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3384)
at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2545)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3365)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3365)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2545)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2759)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:232)
at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:68)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:63)
at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$12(DataSource.scala:183)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:180)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:373)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at kir.com.tmvalidator.Validator.init(Validator.java:30)
at kir.com.tmvalidator.Home.main(Home.java:7)
18/11/12 11:16:29 INFO SparkContext: Invoking stop() from shutdown hook
18/11/12 11:16:29 INFO SparkUI: Stopped Spark web UI at http://kirmac133.domain.com:4040
18/11/12 11:16:29 INFO ContextCleaner: Cleaned accumulator 4
18/11/12 11:16:29 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/11/12 11:16:29 INFO MemoryStore: MemoryStore cleared
18/11/12 11:16:29 INFO BlockManager: BlockManager stopped
18/11/12 11:16:29 INFO BlockManagerMaster: BlockManagerMaster stopped
18/11/12 11:16:29 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/11/12 11:16:29 INFO SparkContext: Successfully stopped SparkContext
18/11/12 11:16:29 INFO ShutdownHookManager: Shutdown hook called
18/11/12 11:16:29 INFO ShutdownHookManager: Deleting directory /private/var/folders/32/7klbv5_94wbddn9kgzvbwnkr0000gn/T/spark-1bde2bb5-603d-4fae-9128-7d92f259077b
java apache-spark hive apache-spark-sql
Can you include the stack trace?
– shriyog
Nov 12 at 11:16
@shriyog - Added
– VK321
Nov 12 at 11:21
Are you getting this error of above-mentioned input in csv?
– shriyog
Nov 12 at 11:28
@shriyog - Yes :(
– VK321
Nov 12 at 11:28
I tested the same for the 2.3.0 version, it works without issues. Checking with 2.4.0
– shriyog
Nov 12 at 11:32
2 Answers
The paranamer version that Spark 2.4.0 pulls in is 2.7, which is causing the issue. Adding the following dependency before spark-core/spark-sql solved the issue for me:
<dependency>
<groupId>com.thoughtworks.paranamer</groupId>
<artifactId>paranamer</artifactId>
<version>2.8</version>
</dependency>
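For reference, here is a minimal sketch of how the question's dependencies section might look with this fix applied; the versions are taken from the question, and paranamer 2.8 is declared ahead of the Spark artifacts as described above, so the direct dependency overrides the 2.7 that Spark resolves transitively.
<dependencies>
    <!-- Direct dependency on paranamer 2.8 overrides the 2.7 pulled in transitively via Spark -->
    <dependency>
        <groupId>com.thoughtworks.paranamer</groupId>
        <artifactId>paranamer</artifactId>
        <version>2.8</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>2.4.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>2.4.0</version>
    </dependency>
</dependencies>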
I ran into the same problem, but when I tried it in Scala, the problem was solved:
val tp = StructType(Seq(StructField("name", DataTypes.StringType), StructField("code", DataTypes.IntegerType)))
val df = spark.read.format("csv").option("header", "true").schema(tp).load("test.csv")
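For a Java version (as in the question), a rough equivalent of the Scala snippet above could look like the following. This is an untested sketch: spark is the SparkSession built in the question, and whether the explicit schema avoids the underlying paranamer problem in Java is not confirmed here.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Explicit schema matching the CSV columns from the question (name, code)
StructType schema = new StructType()
        .add("name", DataTypes.StringType)
        .add("code", DataTypes.IntegerType);

// Providing the schema skips inference, which is the code path that throws in the stack trace
Dataset<Row> df = spark.read()
        .format("csv")
        .option("header", "true")
        .schema(schema)
        .load("test.csv");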
Yes, for now it's working for me on 2.3.0.
– VK321
Nov 13 at 6:15