QueryDB

Posts

Hive Count Query not working

With Hive on the Tez execution engine, count(*) does not work and returns 0.

Solution: set hive.compute.query.using.stats=false

Reference: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties

hive.compute.query.using.stats
Default Value: false
Added In: Hive 0.13.0 with HIVE-5483

When set to true, Hive will answer a few queries like min, max, and count(1) purely using statistics stored in the metastore, so a count of 0 most likely means those stored statistics are missing or out of date. For basic statistics collection, set the configuration property hive.stats.autogather to true. For more advanced statistics collection, run ANALYZE TABLE queries.
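As a rough sketch of applying the setting per session from Java, the snippet below issues the SET statement before the count over plain JDBC. It assumes a HiveServer2 connection obtained elsewhere (a Hive JDBC driver on the classpath) and that the server accepts SET through Statement.execute; the class name, method names, and the table name are hypothetical, and the table name is assumed to be trusted input.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveCountWithoutStats {
    // Session-level setting from the post: do not answer count(*) from metastore statistics.
    static final String DISABLE_STATS = "SET hive.compute.query.using.stats=false";

    // Builds the count query; the table name is assumed to come from trusted code, not user input.
    static String countQuery(String table) {
        return "SELECT COUNT(*) FROM " + table;
    }

    // Runs the count on an existing connection after disabling stats-based answers.
    static long countRows(Connection conn, String table) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(DISABLE_STATS);
            try (ResultSet rs = stmt.executeQuery(countQuery(table))) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }

    public static void main(String[] args) {
        // Prints the two statements that would be sent, in order ("my_table" is a placeholder).
        System.out.println(DISABLE_STATS);
        System.out.println(countQuery("my_table"));
    }
}
```

Setting the property per session keeps the change local to the affected job; refreshing the statistics with ANALYZE TABLE, as the quoted documentation suggests, is the other route.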

AWS EMR Spark – Much Larger Executors are Created than Requested

Starting with EMR 5.32 and EMR 6.2, you may notice that Spark launches much larger executors than you request in your job settings. For example, we started a Spark job with spark.executor.cores = 4, but the executors were launched with 20 cores (instead of the 4 defined by spark.executor.cores).

The reason is an AWS-specific Spark option, spark.yarn.heterogeneousExecutors.enabled (it exists in EMR only, not in open source Spark), which is set to true by default and combines multiple executor creation requests on the same node into one larger executor container. As a result, you get fewer executor containers than you expected, and each of them has more memory and cores than you specified.

If you disable this option (--conf "spark.yarn.heterogeneousExecutors.enabled=false"), EMR will create containers with the specified spark.executor.memory and spark.executor.cores settings and will not coalesce them into la
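To make the setting concrete, here is a minimal sketch that assembles a spark-submit command line with the option disabled. It only builds the argument list and does not launch anything; the class name, the method name, and the jar name my-job.jar are hypothetical, while the two configuration keys are the ones named in the post.

```java
import java.util.ArrayList;
import java.util.List;

class SparkSubmitArgs {
    // Builds spark-submit arguments with an explicit executor size and the EMR coalescing switch.
    static List<String> build(int executorCores, boolean coalesceExecutors, String appJar) {
        List<String> args = new ArrayList<>();
        args.add("spark-submit");
        args.add("--conf");
        args.add("spark.executor.cores=" + executorCores);
        args.add("--conf");
        // EMR-only option from the post; false asks EMR to keep executors at the requested size.
        args.add("spark.yarn.heterogeneousExecutors.enabled=" + coalesceExecutors);
        args.add(appJar); // hypothetical application jar
        return args;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", build(4, false, "my-job.jar")));
    }
}
```

The resulting list could be passed to a ProcessBuilder on a cluster node, or the same two --conf pairs could be typed directly on the command line.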

Java Spring - Write Custom Annotation for javax validations

Suppose I have a model like the one below:

    public class A {
        public List<String> phoneNumbers;
    }

I want to apply an @Max-style validation so that phoneNumbers cannot hold more than a certain number of entries, and this limit must be configurable in a properties file. By default, @Max takes only static values: annotation attributes must be compile-time constants, so we cannot pass a value read from application.properties to this annotation. Instead, we can write a custom annotation like the one below:

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;
    import javax.validation.Constraint;
    import javax.validation.Payload;

    @Target({ ElementType.FIELD })
    @Retention(RetentionPolicy.RUNTIME)
    @Documented
    @Constraint(validatedBy = FieldValidator.class)
    public @interface MaxAllowedSize {
        String message() default "size of list must be less than or equal to %s";
        Class<?>[] groups() default {};
        Class<? exte
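The excerpt is cut off before FieldValidator is shown, so the following is only a guess at the kind of check such a validator might perform, written in plain Java without the javax.validation dependency. The property key phone.numbers.max, the class name, and the method names are all hypothetical; only the message template comes from the annotation above. Treating null as valid follows the usual Bean Validation convention and is likewise an assumption here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class MaxAllowedSizeCheck {
    // Hypothetical property key; the post does not name the real one.
    static final String LIMIT_KEY = "phone.numbers.max";
    // Message template taken from the MaxAllowedSize annotation in the post.
    static final String MESSAGE = "size of list must be less than or equal to %s";

    // Reads the configured limit, e.g. from values loaded out of application.properties.
    static int configuredLimit(Properties props) {
        return Integer.parseInt(props.getProperty(LIMIT_KEY));
    }

    // A null list is treated as valid; otherwise the size must not exceed the limit.
    static boolean isValid(List<?> value, int limit) {
        return value == null || value.size() <= limit;
    }

    // Fills the %s placeholder of the message with the configured limit.
    static String violationMessage(int limit) {
        return String.format(MESSAGE, limit);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(LIMIT_KEY, "2");
        int limit = configuredLimit(props);
        List<String> phoneNumbers = Arrays.asList("111", "222", "333");
        if (!isValid(phoneNumbers, limit)) {
            System.out.println(violationMessage(limit));
        }
    }
}
```

In a real validator this logic would presumably sit inside a ConstraintValidator implementation, with the limit injected by Spring rather than read by hand.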

Spring JMS ActiveMQ - Listener Consumers getting hung or stuck after some time

With Spring JMS and ActiveMQ, the listener consumers hang or get stuck after some time, i.e. the instances stop consuming messages from AMQ after running for a while.

On analyzing a thread dump, we found the following listener thread in parked status:

    "Listener-1" #51 prio=5 os_prio=0 tid=0x00007fbc85599800 nid=0x2d016 waiting on condition [0x00007fbb3cc89000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000b1da9348> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
        at org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:48)

Thus, messages were no
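The trace shows the listener parked inside ArrayBlockingQueue.take(), which has no timeout and waits until an element arrives. The plain-JDK sketch below is not ActiveMQ code; it only reproduces that thread state on an empty queue and contrasts it with poll, which gives up after a bounded wait. The class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class ParkedTakeDemo {
    // Starts a daemon thread that calls take() on the queue and reports the state it settles in.
    static Thread.State stateWhileTaking(BlockingQueue<String> queue) {
        Thread listener = new Thread(() -> {
            try {
                queue.take(); // blocks until an element is available
            } catch (InterruptedException e) {
                // interrupted by the caller below; let the thread end
            }
        }, "Listener-demo");
        listener.setDaemon(true);
        listener.start();

        // Give the thread up to five seconds to reach the parked state.
        Thread.State state = listener.getState();
        long deadline = System.currentTimeMillis() + 5000;
        while (state != Thread.State.WAITING && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            state = listener.getState();
        }
        listener.interrupt(); // release the demo thread
        return state;
    }

    // Waits at most the given time and returns null if nothing arrived.
    static String pollWithTimeout(BlockingQueue<String> queue, long millis) {
        try {
            return queue.poll(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("take() on empty queue: " + stateWhileTaking(new ArrayBlockingQueue<>(1)));
        System.out.println("poll(50 ms) on empty queue: " + pollWithTimeout(new ArrayBlockingQueue<>(1), 50));
    }
}
```

A thread stuck in take() looks exactly like the dump above: WAITING (parking) with no deadline, which is why the consumer never resumes on its own if the expected response never arrives.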