Java problem diagnosis tool: Greys-Anatomy

This post introduces a very useful Java process monitoring tool that can print the average and maximum execution time of each method in a Java class, which is very helpful when troubleshooting latency in Java programs. The tool is called Greys-Anatomy.
Greys is positioned as a professional JVM business problem diagnosis tool. Since it works at the JVM level, its users are mostly Java programmers. Its author hopes to share the skills and ideas gained while writing it, so that more Java programmers can take part in its development or benefit from it.

Main features

1. View class and method information already loaded by the JVM
2. Method execution monitoring: call counts, success/failure rates, response times
3. Method data inspection: record and view input parameters, return values, and exception information; supports action playback
4. Performance overhead rendering: trace method call paths under a specified package and measure the time each step takes
5. View the method call stack

Download and install
Reference: https://github.com/oldmanpushcart/greys-anatomy

After the installation is complete, you can write a shell script similar to the following (my-greys.sh; this assumes the Java program runs inside a Tomcat container):

#!/bin/bash
JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/
JRE_HOME=$JAVA_HOME   # the original used JRE_HOME without defining it; assume it equals JAVA_HOME
CLASS_PATH=.:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASS_PATH PATH
# find the PID of the target Tomcat instance
pid=$(ps -ef | grep -i tomcat-8.5-1 | grep -v 'grep' | awk '{print $2}')
# attach Greys to the process, teeing its output into a log file
/data/greys/greys.sh $pid@127.0.0.1:3258 | tee -a /data/greys/greys.log

Execute the script:

> sh my-greys.sh

The Greys console interface then appears.

Then run the monitor -c 5 com.my.api.* * command (the first pattern matches classes, the second matches methods) to print monitoring statistics every 5 seconds.

By printing the information, you can find out which method takes the longest time to execute.

Shell scripts monitor server processes and ports
Recently, while learning shell programming, I wrote a script that lists the ports, PIDs, and program names currently in use on a server. It can be used to spot unfamiliar listening ports and thus to judge whether the machine has been compromised by an attacker.

The code is as follows:

#!/bin/bash

# tcp part: collect all listening TCP ports (IPv4 and IPv6)
port1=`netstat -an|grep LISTEN|egrep "0.0.0.0|:::"|awk '/^tcp/ {print $4}'|awk -F: '{print $2$4}'|sort -n`
echo "TCP state:"
echo "--------------------------------"
echo "PORT PID COMMAND"
for a in $port1
do
    # look up the PID and command that own each listening port
    b=`lsof -n -i:$a|grep TCP|grep LISTEN|grep IPv4|awk '{printf("%d\t%s\n", $2, $1)}'`
    echo "$a $b"
done
echo "--------------------------------"

# udp part: collect all bound UDP ports
echo ""
port2=`netstat -an|grep udp|awk '{print $4}'|awk -F: '{print $2}'|sed '/^$/d'|sort -n`
echo "UDP state:"
echo "--------------------------------"
echo "PORT PID COMMAND"
for a in $port2
do
    b=`lsof -n -i:$a|grep UDP|grep IPv4|awk '{printf("%d\t%s\n", $2, $1)}'`
    # UDP is connectionless, so only print ports that map back to a process
    if [ -n "$b" ]; then
        echo "$a $b"
    fi
done
echo "--------------------------------"

exit 0

Use the AWS SDK for Java 2.0 to set S3 object public access

Region region = Region.AP_SOUTHEAST_1;
S3Client s3 = S3Client.builder().region(region).build();

String bucket = "your bucket name"; // e.g. "bucket" + System.currentTimeMillis()
String key = "your file name";

// create the bucket if it does not exist
createBucket(bucket, region);

// put the object
PutObjectRequest request = PutObjectRequest.builder().bucket(bucket).key(key).build();
s3.putObject(request, RequestBody.fromByteBuffer(getRandomByteBuffer(10_000)));

// set a public-read canned ACL on the object
PutObjectAclRequest putAclReq = PutObjectAclRequest.builder()
        .bucket(bucket)
        .key(key)
        .acl(ObjectCannedACL.PUBLIC_READ)
        .build();
s3.putObjectAcl(putAclReq);

// grant read permission to an email address: first get the current ACL
GetObjectAclRequest objectAclReq = GetObjectAclRequest.builder()
        .bucket(bucket)
        .key(key)
        .build();
String email = "your email address";
GetObjectAclResponse getAclRes = s3.getObjectAcl(objectAclReq);
Grantee grantee = Grantee.builder()
        .emailAddress(email)
        .type(Type.AMAZON_CUSTOMER_BY_EMAIL)
        .build();
// grants() returns an immutable list, so copy it before adding to it
List<Grant> grants = new ArrayList<>(getAclRes.grants());
Grant newGrant = Grant.builder()
        .grantee(grantee)
        .permission(Permission.READ)
        .build();
grants.add(newGrant);

// put the new ACL; the owner must be carried over from the current ACL
AccessControlPolicy acl = AccessControlPolicy.builder()
        .owner(getAclRes.owner())
        .grants(grants)
        .build();
PutObjectAclRequest putEmailAclReq = PutObjectAclRequest.builder()
        .bucket(bucket)
        .key(key)
        .accessControlPolicy(acl)
        .build();
s3.putObjectAcl(putEmailAclReq);
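
The snippet above relies on two helper methods, createBucket and getRandomByteBuffer, that it never defines. Here is a minimal sketch of what they might look like; the headBucket-based existence check is an assumption, and the S3Client is passed in explicitly since the original presumably kept it in a field:

import java.nio.ByteBuffer;
import java.util.Random;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;

// hypothetical helper: create the bucket only if it does not already exist
static void createBucket(S3Client s3, String bucket, Region region) {
    try {
        // headBucket succeeds only when the bucket exists and is accessible
        s3.headBucket(HeadBucketRequest.builder().bucket(bucket).build());
    } catch (NoSuchBucketException e) {
        s3.createBucket(CreateBucketRequest.builder()
                .bucket(bucket)
                // a location constraint is required outside us-east-1
                .createBucketConfiguration(CreateBucketConfiguration.builder()
                        .locationConstraint(region.id())
                        .build())
                .build());
    }
}

// hypothetical helper: build a buffer of random bytes to upload
static ByteBuffer getRandomByteBuffer(int size) {
    byte[] bytes = new byte[size];
    new Random().nextBytes(bytes);
    return ByteBuffer.wrap(bytes);
}

Note that buckets created with default settings in recent years have Block Public Access enabled, in which case the PUBLIC_READ ACL call above will be rejected; the bucket must explicitly allow ACLs for this approach to work.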

Yii pseudo-static pages

Here is how to serve static PHP files under the Yii framework without creating a separate action for each one. This is a quick note of my approach; I hope it leads to better implementations.

1. Configure urlManager in main.php:

'urlManager'=>array(
        'urlFormat'=>'path',
        'showScriptName'=>false,
        'rules'=>array(
                'post/<view:.*>.html'=>'post/page/',
                '<controller:\w+>/<action:\w+>'=>'<controller>/<action>',
        ),
),

The rule 'post/<view:.*>.html'=>'post/page/' is the important line: it maps any URL of the form post/<path>.html to the post controller's page action, passing <path> as the view parameter.

2. Implement a PostController whose page action is the built-in CViewAction:

<?php  
class PostController extends Controller{  
    public function actions() {  
        return array (  
                'page' => array (  
                        'class' => 'CViewAction'   
                )   
        );  
    }  
}  

3. Add a pages directory under the post directory of the corresponding views directory, then put static PHP files (such as 12345.php) inside it.
The file can then be accessed via http://domainname/post/12345.html. If the file sits in a subdirectory (such as 20120920/123456.php), it can be reached the same way, via http://domainname/post/20120920/123456.html.

Kafka Producer Performance Optimization

When we are talking about performance of Kafka Producer, we are really talking about two different things:

  • latency: how much time passes from the time KafkaProducer.send() was called until the message shows up in a Kafka broker.
  • throughput: how many messages can the producer send to Kafka each second.

Many years ago, I was in a storage class taught by scalability expert James Morle. One of the students asked why we need to worry about both latency and throughput – after all, if processing a message takes 10ms (latency), then clearly throughput is limited to 100 messages per second. Looked at this way, latency and throughput seem to be simple inverses of each other. However, the relation between latency and throughput is not this trivial.

Let's start our discussion by agreeing that we are only talking about the new Kafka Producer (the one in the org.apache.kafka.clients package). It makes things simpler, and there's no reason to use the old producer at this point.

The Kafka Producer allows sending messages in batches. Suppose that due to network round-trip times, it takes 2ms to send a single Kafka message. By sending one message at a time, we have a latency of 2ms and a throughput of 500 messages per second. But suppose that we are in no big hurry and are willing to wait a few milliseconds to send a larger batch – let's say we decided to wait 8ms and managed to accumulate 1,000 messages. Our latency is now 10ms, but our throughput is up to 100,000 messages per second! That's the main reason I love microbatches so much. By adding a tiny delay – and 10ms is usually acceptable even for financial applications – our throughput is 200 times greater. This type of trade-off is not unique to Kafka, by the way. Network and storage subsystems use this kind of “micro batching” all the time.

Sometimes latency and throughput interact in even funnier ways. One day Ted Malaska complained that with Flafka, he can get 20ms latency when sending 100,000 messages per second, but huge 1-3s latency when sending just 100 messages a second. This made no sense at all, until we remembered that to save CPU, if Flafka doesn’t find messages to read from Kafka it will back off and retry later. Backoff times started at 0.5s and steadily increased. Ted kindly improved Flume to avoid this issue in FLUME-2729.

Anyway, back to the Kafka Producer. There are a few settings you can modify to improve latency or throughput in the Kafka Producer:

  • batch.size – This is an upper limit on how much data the Kafka Producer will attempt to batch before sending – specified in bytes (the default is 16K bytes, so 16 messages if each message is 1K in size). Kafka may send batches before this limit is reached (so latency doesn't change by modifying this parameter), but will always send when this limit is reached. Therefore setting this limit too low will hurt throughput without improving latency. The main reason to set this low is lack of memory – Kafka will always allocate enough memory for the entire batch size, even if latency requirements cause it to send half-empty batches.
  • linger.ms – How long the producer will wait before sending, in order to allow more messages to accumulate in the same batch. Normally the producer will not wait at all, and simply sends all the messages that accumulated while the previous send was in progress (2ms in the example above), but as we've discussed, sometimes we are willing to wait a bit longer in order to improve the overall throughput at the expense of a little higher latency. In this case tuning linger.ms to a higher value will make sense. Note that if batch.size is low and the batch is full before linger.ms time passes, the batch will be sent early, so it makes sense to tune batch.size and linger.ms together – see the configuration sketch after this list.
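
To make this concrete, here is a minimal configuration sketch that trades a little latency for throughput; the broker address, the serializers, and the specific values (64K batches, 10ms linger) are illustrative assumptions, not recommendations:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024); // allow 64K per batch instead of the 16K default
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);         // wait up to 10ms for a batch to fill
KafkaProducer<String, String> producer = new KafkaProducer<>(props);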

Other than tuning these parameters, you will want to avoid waiting on the future of the send method (i.e. the result from the Kafka brokers), and instead send data continuously to Kafka. You can simply ignore the result (if the success of sending messages is not critical), but it's probably better to use a callback. You can find an example of how to do this in my github (look at the produceAsync method); a sketch of the same idea follows below.
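
Here is a hedged sketch of the callback approach, in the spirit of the produceAsync method mentioned above (the topic name, keys, and error handling are assumptions; producer is the instance from the previous sketch):

import org.apache.kafka.clients.producer.ProducerRecord;

String messageKey = "key";     // placeholder
String messageValue = "value"; // placeholder
producer.send(new ProducerRecord<>("my-topic", messageKey, messageValue),
        (metadata, exception) -> {
            if (exception != null) {
                // the send failed; log and handle it instead of blocking on the future
                exception.printStackTrace();
            }
        });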

If sending is still slow and you are trying to understand what is going on, you will want to check whether the send thread is fully utilized through jvisualvm (it is called kafka-producer-network-thread), or keep an eye on the average batch size metric. If you find that you can't fill the buffer fast enough and the sender is idle, you can try adding application threads that share the same producer and increase throughput this way.
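
Since KafkaProducer is thread-safe, sharing one instance across threads is straightforward. A sketch of that suggestion, where the thread count and the hypothetical nextRecord() message source are assumptions:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService pool = Executors.newFixedThreadPool(4); // thread count is an assumption
for (int i = 0; i < 4; i++) {
    pool.submit(() -> {
        // nextRecord() stands in for whatever produces your messages; returns null when done
        for (ProducerRecord<String, String> record = nextRecord(); record != null; record = nextRecord()) {
            producer.send(record, (metadata, e) -> {
                if (e != null) {
                    e.printStackTrace();
                }
            });
        }
    });
}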

Another concern can be that the Producer will send all the batches that go to the same broker together when at least one of them is full – if you have one very busy topic and others that are less busy, you may see some skew in throughput this way.

Sometimes you will notice that the producer performance doesn’t scale as you add more partitions to a topic. This can happen because, as we mentioned, there is a send buffer for each partition. When you add more partitions, you have more send buffers, so perhaps the configuration you set to keep the buffers full before (# of threads, linger.ms) is no longer sufficient and buffers are sent half-empty (check the batch sizes). In this case you will need to add threads or increase linger.ms to improve utilization and scale your throughput.

Got more tips on ingesting data into Kafka? Comments are welcome!