Skip to main content

Posts

Spark MongoDB Write Error - com.mongodb.MongoBulkWriteException: Bulk write operation error on server 'E11000 duplicate key error collection:'

  One may see following error or exception, while running Spark 2.4 with -  mongo-spark-connector_2.11-2.4.0.jar mongo-java-driver-3.9.0.jar Exception -  User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 6.0 failed 4 times, most recent failure: Lost task 2.3 in stage 6.0 (TID 238, nc0020.hadoop.mycluster.com, executor 2): com.mongodb.MongoBulkWriteException: Bulk write operation error on server vondbd0008.mymachine.com:27017. Write errors: [BulkWriteError{index=0, code=11000, message='E11000 duplicate key error collection: POC1_DB.MyCollection index: _id_ dup key: { _id: "113442141" }', details={ }}]. at com.mongodb.connection.BulkWriteBatchCombiner.getError(BulkWriteBatchCombiner.java:177) at com.mongodb.connection.BulkWriteBatchCombiner.throwOnError(BulkWriteBatchCombiner.java:206) at com.mongodb.connection.BulkWriteBatchCombiner.getResult(BulkWriteBatchCombiner.java:147) at com.mongodb.operation.BulkWrite

Python pyodbc - Error - [unixODBC][Oracle][ODBC][Ora]ORA-12162: TNS:net service name is incorrectly specified

One may encounter below errors while connecting to oracle, using pyodbc, using python 3 [unixODBC][Driver Manager]Can't open lib 'Oracle ODBC driver for Oracle 19' : file not found (0) (SQLDriverConnect)  [unixODBC][Oracle][ODBC][Ora]ORA-12162: TNS:net service name is incorrectly specified\n (12162) (SQLDriverConnect) RuntimeError: Unable to set SQL_ATTR_CONNECTION_POOLING attribute The solution to fix above errors is to -  Make following entry in /etc/odbcinst.ini                  [Oracle ODBC driver for Oracle 19]                Description=Oracle ODBC driver for Oracle 19                Driver=$ORACLE_HOME/lib/libsqora.so.19.1                FileUsage=1                Driver Logging=7                UsageCount=1 Don't use following connect string -                 import pyodbc                myhost='<>'                myservicename='<>'                myuserid='<>'                mypassword='<>'               

CVE-2022-33891 Apache Spark Command Injection Vulnerability

  Please refer - https://spark.apache.org/security.html The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by using a raw Linux command. If an attacker is sending reverse shell commands using  ?doAs . There is also a high chance of granting apache spark server access to the attackers’ machine. Vulnerability description - The Apache Spark UI offers the possibility to enable ACLs via the configuration option spark.acls.enable. With an authentication filter, this checks whether a user has access permissions to view or modify the application. If ACLs are enabled, a code path in HttpSecurityFilter can allow someone to perform impersonation by providing an arbitrary user name. A malicious user might then be able to reach a permission check function that will ultimately build a Unix shell command based on their input, and execute it. This will result in arbitrary shell command execution as the user Spark is currently running as. V

Hive Metastore ER Diagram

 

Hadoop Distcp to HCP or AWS S3a leading to Error - com.amazonaws.SdkClientException: Unable to execute HTTP request: sun.security.validator.ValidatorException: PKIX path building failed

  Running Hadoop Distcp to copy data from S3a resulted in  below error -  **com.amazonaws.SdkClientException: Unable to execute HTTP request: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target” Stack trace: com.amazonaws.SdkClientException: Unable to execute HTTP request: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1114) ~[aws-java-sdk-core-1.11.280.jar!/:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1064) ~[aws-java-sdk-core-1.11.280.jar!/:?] To debug this error, turn SSL debug logging on   -Djavax.net.debug=all , or  -Djavax.net.debug=ssl Above parameters c