In this post I will show how to report JMX metrics to Logstash via TCP in a push-based way, without changing the Java code of an existing application.
Performance
When running Spark 1.6 on YARN clusters, I ran into problems when YARN preempted Spark containers and the Spark job then failed. This happened only occasionally, when YARN used a fair scheduler and other queues with a higher priority submitted…
In my previous post I showed how to increase the parallelism of Spark processing by increasing the number of executors on the cluster. In this post I will try to show how to distribute the data in such a way that the cluster…
In Apache Spark, the key to performance is parallelism. The first step toward parallelism is getting the partition count to a good level, as the partition is the atom of each job. Reaching a good level of…
OpenCL is a GPGPU (general-purpose computing on graphics processing units) framework that allows us to use the GPU for massively parallel programming. Compared to OpenGL, which is an API for (3D) computer graphics, OpenCL defines everything more generally to let non-graphics applications benefit…