
Posts

Showing posts from February, 2021

Copy Code of a Git Repo into a Different Git Repo with Commit History

1) Clone the source repository into a temporary directory:
   git clone <url to Source repo> temp-dir
2) Check the available branches:
   git branch -a
3) Check out every branch that you want to copy (repeat for each branch):
   git checkout branch-name
4) Fetch all the tags:
   git fetch --tags
5) Remove the link to the source repository:
   git remote rm origin
6) Link your local repository to your newly created repository:
   git remote add origin <url to NEW repo>
7) Push all your branches and tags:
   git push origin --all
   git push --tags
8) The above steps produce a complete copy of the source repository, history included, in the new repository; a consolidated sketch of the same steps follows below.
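If you prefer to run the whole sequence at once, here is a minimal shell sketch of the steps above. The placeholder URLs and the loop over remote branches are assumptions; adjust them for your repositories.

# Minimal sketch, assuming placeholder URLs for the two repositories.
SOURCE_URL="<url to Source repo>"   # assumption: source repository URL
NEW_URL="<url to NEW repo>"         # assumption: destination repository URL

git clone "$SOURCE_URL" temp-dir
cd temp-dir

# Create a local branch for every remote branch so that "git push --all"
# pushes all of them (this automates step 3 above).
for branch in $(git branch -r | grep -v 'HEAD' | sed 's#origin/##'); do
  git checkout "$branch"
done

git fetch --tags
git remote rm origin
git remote add origin "$NEW_URL"
git push origin --all
git push --tags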

Spark-Teradata Connection Issues

Exception:
Caused by: java.lang.NullPointerException
        at com.teradata.tdgss.jtdgss.TdgssConfigApi.GetMechanisms(Unknown Source)
        at com.teradata.tdgss.jtdgss.TdgssManager.<init>(Unknown Source)
        at com.teradata.tdgss.jtdgss.TdgssManager.<clinit>(Unknown Source)

Brief: tdgssconfig.jar cannot be found on the classpath. Add it to the classpath alongside the Teradata JDBC driver.

Exception:
java.sql.SQLException: [Teradata Database] [TeraJDBC 15.10.00.33] [Error 3707] [SQLState 42000] Syntax error, expected something like a name or a Unicode delimited identifier or an 'UDFCALLNAME' keyword or '(' between the 'FROM' keyword and the 'SELECT' keyword.

Brief: Spark JDBC normally expects the dbtable property to be a table name, so internally it prepends "select * from" to it, producing select * from <Table Name>. If a SQL statement is specified instead of a table name, the generated statement becomes something like select * from select ... ; which is not valid SQL. The above malformed statement is what triggers the 3707 syntax error; the usual fix is to wrap the SQL in parentheses and give it an alias, e.g. dbtable = "(select ... ) t", so that the generated statement stays valid.
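As a minimal sketch of the classpath fix, assuming terajdbc4.jar and tdgssconfig.jar sit in the working directory and using a hypothetical application class and jar, the job could be submitted like this:

# Minimal sketch; com.example.TeradataLoad and my-spark-app.jar are hypothetical.
# --jars ships both Teradata jars to the driver and executors, and
# --driver-class-path additionally puts them on the driver's classpath,
# which resolves the TdgssManager NullPointerException above.
spark-submit \
  --jars terajdbc4.jar,tdgssconfig.jar \
  --driver-class-path "terajdbc4.jar:tdgssconfig.jar" \
  --class com.example.TeradataLoad \
  my-spark-app.jar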

Splunk Data to Hadoop Ingestion

One approach to getting data from Splunk into Hadoop is to use the REST API provided by Splunk, so that data is periodically ingested into the Hadoop data lake. A simple command like the one below helps in such a scenario:

curl -u '<username>:<password>' \
  -k https://splunkhost:8089/services/search/jobs/export \
  -d search="search index=myindex | head 10" \
  -d output_mode=raw \
  | hdfs dfs -put -f - <HDFS_DIR>

The above command fetches the first 10 events from the Splunk index "myindex" and ingests them into the Hadoop data lake.
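For the periodic ingestion mentioned above, the same command can be wrapped in a small script suitable for cron scheduling. The host, index, time window, and HDFS path below are placeholder assumptions, not values from the original post:

# Minimal sketch of a periodic export; all values below are placeholders.
SPLUNK_HOST="splunkhost"             # assumption: Splunk search head host
INDEX="myindex"                      # assumption: Splunk index to export
HDFS_DIR="<HDFS_DIR>"                # assumption: target HDFS directory
TS=$(date +%Y%m%d%H%M)

# Export the last hour of events (assumed window) and stream them into HDFS.
curl -s -u '<username>:<password>' \
  -k "https://${SPLUNK_HOST}:8089/services/search/jobs/export" \
  -d search="search index=${INDEX} earliest=-1h" \
  -d output_mode=raw \
  | hdfs dfs -put -f - "${HDFS_DIR}/export_${TS}.raw"

A crontab entry such as 0 * * * * /path/to/splunk_to_hdfs.sh (hypothetical path) would then run the export hourly.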