Chris's Blog

Devops Shokunin

Using JMX on Vagrant


Getting a remote JMX connection from my desktop to a Vagrant machine took a few tries.

Vagrantfile configuration to add more memory and forward the HTTP and JMX ports:

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "ubuntu/xenial64"
  # Forward the application's HTTP port and the JMX port to the host
  config.vm.network "forwarded_port", guest: 8080, host: 8080
  config.vm.network "forwarded_port", guest: 9010, host: 9010
  config.vm.provider "virtualbox" do |vb|
    # Give the VM 2 GB of memory
    vb.customize ["modifyvm", :id, "--memory", "2048"]
  end
end

The application start script needs to be modified to set up remote access:

 java \
 -Dcom.sun.management.jmxremote \
 -Dcom.sun.management.jmxremote.port=9010 \
 -Dcom.sun.management.jmxremote.rmi.port=9010 \
 -Djava.rmi.server.hostname=127.0.0.1 \
 -Dcom.sun.management.jmxremote.authenticate=false \
 -Dcom.sun.management.jmxremote.ssl=false \
 -jar "${SOURCE_DIR}/bin/myjar.jar"
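
Both the JMX registry port and the RMI port are pinned to 9010 above so that one forwarded port is enough, and java.rmi.server.hostname is set to 127.0.0.1 because that is the address the client on the host is told to connect back to. A minimal sketch of keeping those flags in one place, assuming a POSIX shell; jmx_opts is a hypothetical helper name, not part of the original start script:

```shell
# Hypothetical helper (not in the original script): prints the JMX flags
# for one port so the registry and RMI ports cannot drift apart.
jmx_opts() {
  jmx_port="$1"
  jmx_host="${2:-127.0.0.1}"   # address the client is told to connect back to
  printf '%s ' \
    "-Dcom.sun.management.jmxremote" \
    "-Dcom.sun.management.jmxremote.port=${jmx_port}" \
    "-Dcom.sun.management.jmxremote.rmi.port=${jmx_port}" \
    "-Djava.rmi.server.hostname=${jmx_host}" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false"
}

# Usage (word splitting of the unquoted substitution is intentional):
#   java $(jmx_opts 9010) -jar "${SOURCE_DIR}/bin/myjar.jar"
```

If the port ever changes, the forwarded_port line in the Vagrantfile has to change with it.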

Connection is now possible by running:

jconsole 127.0.0.1:9010

When JConsole starts up, select “Insecure Connection”.

Blogspam Analysis with R Part 1


This morning, while checking the comments on this blog, I was surprised at the number of spam comments caught by the Akismet plugin, so I decided to dive in with some logfile analysis using R to see if I could lessen the scourge.

Grab the data from my nginx logs. Since I get very few comments, we can assume that everything is spam.

echo '"IP","Date"' > ~/tmp/data_analysis/blogspam.csv
zgrep '/wp-comments-post.php' /var/log/nginx/acc* |
perl -ne 'if (m/.*:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) \- \- \[(\d+)\/(\w+)\/2014/)
{print "\"", $1, "\",\"", $3, "/", $2, "\"\n"}' >> ~/tmp/data_analysis/blogspam.csv
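
To make the regular expression concrete, here is a rough sketch of the same match written with sed and run over a single log line. The sample line is made up for illustration: it assumes nginx's default combined log format and the file-name prefix that zgrep adds when it searches several files. extract_row is a hypothetical name.

```shell
# Hypothetical helper: turns zgrep output lines into "IP","Mon/DD" rows,
# mirroring the Perl match above (filename prefix, IP, then [DD/Mon/2014).
extract_row() {
  sed -nE 's#^.*:([0-9]{1,3}(\.[0-9]{1,3}){3}) - - \[([0-9]+)/([A-Za-z]+)/2014.*$#"\1","\4/\3"#p'
}

# Made-up sample line, not taken from the real logs
sample='/var/log/nginx/access.log.2.gz:1.1.1.1 - - [07/Sep/2014:10:15:32 +0000] "POST /wp-comments-post.php HTTP/1.1" 200 5'
printf '%s\n' "$sample" | extract_row
# prints: "1.1.1.1","Sep/07"
```

Lines that do not match the pattern (for example, entries from another year) produce no output, just as in the Perl version.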

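Install R and start it from the data directory, since the next step reads blogspam.csv by relative path:

$ sudo apt-get install -y r-base-core
$ cd ~/tmp/data_analysis
$ R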

Load the data into R:

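# stringsAsFactors=TRUE keeps the per-value counts in summary() on R 4.0 and later
spammers <- read.csv(file="blogspam.csv", header=TRUE, sep=",", stringsAsFactors=TRUE)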

Let’s find the busiest IPs and the heaviest days:

 > summary(spammers)
               IP             Date      
 1.1.1.1        : 2135   Sep/07 : 1364  
 2.2.2.2        : 2069   Oct/02 : 1353  
 3.3.3.3        : 1971   Oct/03 : 1348  
 4.4.4.4        : 1864   Sep/09 : 1344  
 5.5.5.5        : 1819   Oct/01 : 1333  
 6.6.6.6        : 1712   Sep/30 : 1328  
 (Other)        :50435   (Other):53935

Histogram by IP Frequency

iplist <- as.data.frame(table(spammers$IP))
hist(iplist$Freq, breaks=100, xlab="ip distribution",
      main="Spammer IPs",  col="darkblue")

[Figure: histogram of spam comment counts per IP]

This shows that no single IP is causing all of the trouble. Going by the summary above, the busiest address accounts for only about 3% of the 62,005 logged comment posts, so blocking one IP would not solve the problem.
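
The same conclusion can be cross-checked from the shell without R. A minimal sketch, assuming the two-column blogspam.csv built earlier, with a header row; top_ips and top_share are hypothetical helper names:

```shell
# Hypothetical helpers for a quick cross-check of the CSV outside R.

# List the N busiest addresses (default 5) with their row counts.
top_ips() {
  tail -n +2 "$1" | cut -d, -f1 | tr -d '"' |
    sort | uniq -c | sort -rn | head -n "${2:-5}"
}

# Print the percentage of all rows that belong to the single busiest address.
top_share() {
  awk -F, '
    NR > 1 { n[$1]++; total++ }
    END {
      for (ip in n) if (n[ip] > max) max = n[ip]
      if (total) printf "%.1f\n", 100 * max / total
    }' "$1"
}

# Usage:
#   top_ips ~/tmp/data_analysis/blogspam.csv 10
#   top_share ~/tmp/data_analysis/blogspam.csv
```

A small number from top_share means the spam is spread across many addresses rather than concentrated in one.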

Graph the number of spam comments per day.
Note: you need to sort the data by date, or your lines will be all over the place and the graph will be unreadable.

dates <- as.data.frame(table(spammers$Date))
datessorted <- dates[order(as.Date(dates$Var1,format = "%b/%d")),]
plot(as.POSIXct(datessorted$Var1,format = "%b/%d"),
  datessorted$Freq, main="spam comments", xlab="date", ylab="count", type="l")

[Figure: spam comments per day]
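
The sorting caveat is easy to see outside R as well: sorted as plain text, "Oct/01" comes before "Sep/30". A minimal sketch, assuming the same blogspam.csv layout, that rewrites each date as a numeric MM-DD key before counting so that a plain sort is chronological; daily_counts is a hypothetical name:

```shell
# Hypothetical helper: prints "MM-DD count" lines in chronological order.
daily_counts() {
  tail -n +2 "$1" | cut -d, -f2 | tr -d '"' |
    awk -F/ '
      BEGIN {
        # Map English month abbreviations to their numbers
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        for (i = 1; i <= 12; i++) num[m[i]] = i
      }
      { printf "%02d-%s\n", num[$1], $2 }' |
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   daily_counts ~/tmp/data_analysis/blogspam.csv
```

This only works within a single year, which matches the extraction step, where 2014 is hard-coded.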

This gives me a basic idea of the problem; further analysis will follow in Part 2.

Note: I cleared out the spam comments before I sat down to write this post, and I have received 101 new ones since.