Friday, 15 July 2016

Byte patterns - Human DNA

Hi all,

So you may have read my previous blog on counting bytes in a file. Well, I figured that as DNA is just a combination of letters (which stand for the chemicals involved), I could see the ratio of the different chemicals by using my code. Would I find anything interesting?

So here they are:

X-Chromosome

Y-Chromosome


As you can see, there's not a lot that's 'interesting'. I was hoping that there would be some definite differences between the two. Having said that, maybe the similarities and the ratios are the interesting thing about the result.

Oh, in case you are wondering. I haven't discovered a new chemical involved in DNA. That's actually line breaks!
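If you want to try the ratios yourself, here is a minimal sketch in Python. It is not the script from my previous post; the function name base_ratios and the sample sequence are made up for illustration, and a real chromosome file would be read from disk instead.

```python
from collections import Counter


def base_ratios(data: bytes) -> dict:
    """Return each byte value's share of the total, keyed by character."""
    counts = Counter(data)
    total = len(data)
    return {chr(value): count / total for value, count in counts.items()}


# Hypothetical sample: two short lines of sequence data, line breaks included.
sample = b"ACGTACGTNN\nAACCGGTTNN\n"
ratios = base_ratios(sample)
for char, share in sorted(ratios.items()):
    print(repr(char), round(share, 3))
```

Note that the line break shows up as its own entry, which is exactly the "extra chemical" in the images.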

Thanks for reading!




Friday, 8 July 2016

More Pratting Around With Colour and Data...

Hi all,

So there I was trying to work out why data is so huge, trying to work out a way of handling it better, when I had a thought. What actually is data?

Data, in simplest terms, is a string of bytes. In order for that data to be useful we put those bytes into a logical order to describe things such as text, images and the like inside a file. Normally these files have headers and other such human recognisable features to describe how the computer readable data is laid out inside the file.

I digress. Data is (currently) arranged in files for us humans to look at, they have structure, all future files will have structure, so I want to predict what those structures will be. To that end I started playing.

Plain Text file


I wrote a script in Python that counts the occurrences of each byte value. Its output is a PNG in which the closer a colour is to red, the more often that byte value occurs relative to the total number of bytes in the file.
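A rough sketch of that kind of script is below. It is an illustration under a few assumptions rather than the original: the names byte_frequencies and to_ppm are made up, the black-to-red colour scale is a guess at the mapping, and it writes a PPM image instead of a PNG so that it needs nothing outside the standard library.

```python
from collections import Counter


def byte_frequencies(data: bytes) -> list:
    """Return 256 floats: each byte value's share of the whole input."""
    counts = Counter(data)
    total = len(data) or 1  # avoid dividing by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def to_ppm(freqs: list) -> bytes:
    """Render the 256 frequencies as a 16x16 binary PPM image.

    The most frequent byte is drawn pure red; rarer bytes fade to black.
    """
    peak = max(freqs) or 1.0
    header = b"P6\n16 16\n255\n"
    pixels = bytearray()
    for share in freqs:
        red = round(255 * share / peak)
        pixels += bytes((red, 0, 0))
    return header + bytes(pixels)


# Hypothetical sample standing in for a plain text file.
sample = b"the quick brown fox jumps over the lazy dog\n" * 10
image = to_ppm(byte_frequencies(sample))
print(len(image), "bytes of image data")
```

Write image to a file opened in binary mode and most image viewers will open it. Each pixel is one byte value, counting from 0 in the top left to 255 in the bottom right.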

As the image shows, byte values that correspond to language-based ASCII characters feature heavily. Now what happens if we zip it?

Zipped Plain Text


As you can see the image is a lot flatter. This is because of the way zipping a file works. Essentially it records one or more bytes and then, on the next occurrence, refers back to them instead of storing them again. I can now prove that the maths for different types of compression is different.
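You can put a number on "flatter" without drawing anything. The sketch below is an assumption-laden illustration, not my original script: it uses Python's zlib module (the same DEFLATE method zip files normally use, in a different wrapper) on a made-up sample text, and measures the Shannon entropy of the byte histogram, where 8 bits per byte would be perfectly flat.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte histogram: 0 = a single value, 8 = perfectly flat."""
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample text; any reasonably long plain text file would do.
words = ["data", "is", "a", "string", "of", "bytes", "in", "some", "logical", "order"]
rng = random.Random(0)
plain = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")
packed = zlib.compress(plain, 9)

print("plain :", round(entropy_bits_per_byte(plain), 2), "bits per byte")
print("packed:", round(entropy_bits_per_byte(packed), 2), "bits per byte")
```

The packed figure should come out much closer to 8 than the plain one, which is the flattening you can see in the image.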

7z file


See? Proof positive something else is going on. Ok, ok, but it is interesting isn't it? How it's flattened out? It's something I've noticed with other file types too. Here are some below:

JPG file.

PDF file

MSI file




The smaller the file and the higher the compression, the flatter it is. That holds whether the compression is lossless (zip, 7z) or lossy (JPG).

Nobbing around with data is something I've been doing for years. I am a great believer that we have already reached "L-Space". Everything that has ever been written, or ever will be written, is already out there. With the birth of huge datastores and the internet, it may well be that every combination of bytes has already been written somewhere.

Thanks for reading,



Friday, 1 July 2016

Hybrid mind from hybrid cloud

Hi all,

The human mind is one of the greatest achievements in the natural world. Its power and utility are still not entirely known, its limits unfathomable.
There are true geniuses, ones who can consciously direct their thoughts and imagine the universe of the very small and the very large, ones who can describe the beauty of nature in pure mathematics. There are those of us who are terrible at maths, but subconsciously your mind can estimate the speed of objects, the distance required to travel, the angle of the road, any physical limitations of movement to cross a road or catch a ball. It can also prioritise very well. Do we need to concentrate on the Lion in the bushes, or the stone in our shoe? It is a fantastic machine.

But it is not without its limitations. Yes, sometimes the mathematics ability is hidden from us. We can instinctively cross the road or catch a ball, but times tables are still a mystery. It's affected by mood, emotions, tiredness and, more importantly, what I want to focus on here: distraction and procrastination.

First though, a quick leap to computing. Computers are one of the greatest inventions of the human race. The power and speed of processors has increased at a near exponential rate. The speed of storage has increased by a phenomenal amount too, since the invention of punch tape or physical switches. In the early days, we were limited to doing things one after the other, but then RAM and multi-core and multi-threaded processors came along. We could do more at once. No longer limited to sequential processing or access, things could be done at the same time and delivered faster.

And here is the rub.

In the IT sphere we have been, for decades now, trying to speed up computers. They are blisteringly fast. Any question you want answered can be retrieved very quickly. Any communication will arrive instantly and pop up on our screen. Email, Instant Messaging, Facebook, that report we wanted, that spreadsheet that's taking a while. All popping up as soon as they are ready. Each fighting for our attention. Each requiring our poor brains to switch focus again and again and again.

This is wrong.

We created computers so we could tell them what to do for us. Now they are driving. All these pop-ups and alerts are major distractions, and each one has to be prioritised and thought about by our minds before we decide if we need to click on that report that's just come in. Do you need to open that report, or reply to Kevin who has just IM'd you? Do you need to reply to that Facebook comment? Do you need to answer that email right now? Even if you don't need to talk to Kevin you still need to open the chat window to prioritise it against other things that have 'just popped up'.

My suggestion is that for our normal devices, our desktops and laptops and pads and phones and all-in-ones and smart watches (an ad nauseam list in the 21st century), we stop trying to make them faster. We need to make them "smarter". More like us. More for us. More like the way we think.

In order for this to happen we need to become more like the computers from yesteryear. More serial, more focussed. Like we used to be, when we used to make, build and hunt. We need to rest our conscious mind by providing things in a serial manner, one job at a time. This will stop distractions. Less stress from swapping between thoughts will also enable our subconscious mind to solve other issues, instead of trying to remember what that last email said. We don't need our computers to be fast, we need them to deliver the answer when we are ready.

At this point, we don't need our technology to be faster, but we do need to become more symbiotic with our technology. Our technology needs to be more sympathetic to our needs, our energy and our minds. Our computers should know when we are ready for another distraction; another report, another message, another picture of a cat even. Instead of the constant bonging, binging, pinging, popups and other distractions that cause our attention to be dragged away from our focus.

Thanks for reading

Saturday, 25 May 2013

How to make the Internet Beautiful. The Code.

Hi All,

Ok, so hopefully you've read the other blog or seen the stuff on youtube and followed a link here or whatever. If not, go back and have a look. Also, this blog is a little bit technical. So if you're interested in the effect rather than the cause, look away now.

Images. Images everywhere. But how did I make mine?

Well at first, I decided that I would read the HTML and use that to generate image data. The smallest unit of image data is the pixel, and we need to give it at least 5 values. These are:

Red (0-255)
Green (0-255)
Blue (0-255)
X (pos)
Y (pos)

Let's ignore X,Y as they are the easiest to sort out. However, we do need to worry about RGB.

Let's go back to some HTML code...

<html><head>I love html</head><body>I really love html</body></html>
Here is some example code. Gotta love it right?

Every character there is represented by a value in something called ASCII. Each value is somewhere between 0 and 255. For example, a capital 'A' has a value of 65. I could have just used that value to describe either a red, green or blue value. But the thing is, reading through my code, I wanted to really test myself at 2am and write some superfluous lines of code. I changed ASCII into HEX.

HEX describes numbers using the digits 0-9 and A-F, where A is 10, B is 11 and so on up to F, which is 15. The number 16 is written as 10 in HEX, and 1A in HEX is 26. As ASCII characters are 0-255, we need 2 HEX characters for each 1 ASCII character.
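If you want to check the conversions yourself, a few lines of Python will do it. This is only an illustration of the number formats using Python built-ins, not part of my Perl script.

```python
# Each character has a numeric code; a capital 'A' is 65.
code = ord("A")

# The same number written as two HEX characters.
as_hex = format(code, "02X")

# And back down to DEC again.
back = int(as_hex, 16)

print(code, as_hex, back)             # 65 41 65
print(int("10", 16), int("1A", 16))   # 16 26
print(int("FF", 16))                  # 255, the biggest value two HEX characters can hold
```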

My code, then, has to loop from the beginning to the end, choosing 2 HEX characters per colour: 6 HEX characters describing all 3 colours, Red, Green and Blue.

Once it has done that, it's a simple case of changing it back down to DEC (our "normal" numbering system) so I can pass it to GD to write the pixel.
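Those steps can be sketched in a few lines of Python. This is a sketch of the idea, not the Perl script itself: it stores the pixels in a dictionary called pixels (a made-up name) instead of drawing them with GD, and it rounds the image size down to a whole number.

```python
import math

html = "<html><head>I love html</head><body>I really love html</body></html>"

# ASCII -> HEX: two HEX characters per character of the page.
hexcode = html.encode("ascii").hex()

# Six HEX characters make one pixel; the image is square, so leftovers are dropped.
size = math.isqrt(len(hexcode) // 6)

pixels = {}
x = y = 0
for i in range(0, len(hexcode) - 5, 6):
    if y >= size:
        break  # out of rows: the remaining characters do not fit in the square
    # HEX -> DEC, two characters per colour.
    r = int(hexcode[i:i + 2], 16)
    g = int(hexcode[i + 2:i + 4], 16)
    b = int(hexcode[i + 4:i + 6], 16)
    pixels[(x, y)] = (r, g, b)
    x += 1
    if x >= size:
        x = 0
        y += 1

print(size, len(pixels), pixels[(0, 0)])  # 4 16 (60, 104, 116)
```

The first pixel is made from the characters '<', 'h' and 't', which is why it comes out as 60, 104, 116.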

Before we get to the (now I look at it) rather rushed and overcomplicated code, we can encode the example:




Anti-aliasing has helped a lot here. Don't forget, it takes 3 characters to make 1 pixel. There are only 16 pixels in that image. It's a 4x4 blown up really big.

Here's the code:

#!/usr/bin/perl
#use strict;
#use warnings;
use GD;
use HTTP::Lite;
use String::HexConvert ':all';

my $filename = 0;

# Wrapper: loop forever, writing one numbered frame per pass.
while (1) {
    $filename++;
    my $x = 0;         # x coord
    my $y = 0;         # y coord
    my $weight = 0;    # colour channel counter: 1 = red, 2 = green, 3 = blue
    my ($r, $g, $b) = (0, 0, 0);

    my $http = HTTP::Lite->new;
    #my $req = $http->request("http://wemakeawesomesh.it/make")
    #my $req = $http->request("http://search.twitter.com/search.json?q=%23BeliebersAreHereForJustin")
    my $req = $http->request("http://theregister.co.uk/index.html")
    #my $req = $http->request("http://feeds.bbci.co.uk/news/rss.xml")
        or die "Unable to get document: $!";
    my $twitcode = $http->body();

    # ASCII -> HEX: two HEX characters for every character fetched.
    my $hexcode = ascii_to_hex($twitcode);

    # Dodgy maths: 6 HEX characters per pixel, square image, leftovers dropped.
    my $size = int(sqrt(length($hexcode) / 6));
    my $im = GD::Image->new($size, $size, 1);    # 1 = truecolour

    # Step through the HEX string one colour value (2 characters) at a time.
    for (my $i = 0; $i + 2 <= length($hexcode); $i += 2) {
        $weight++;
        my $pair = substr($hexcode, $i, 2);
        print "\n\nWEIGHT:  $weight CHARPOS:  $i \n\n$pair";
        if ($weight == 1) {
            $r = hex($pair);
        }
        if ($weight == 2) {
            $g = hex($pair);
        }
        if ($weight == 3) {
            # Third value completes the pixel, so draw it and move along.
            $b = hex($pair);
            print "\nCOORDS: X: $x, Y: $y----- $r,$g,$b\n";
            my $cursorcolour = $im->colorAllocate($r, $g, $b);
            $im->setPixel($x, $y, $cursorcolour);
            $x++;
            $weight = 0;
            if ($x >= $size) {
                $x = 0;
                $y++;
            }
        }
    }

    # Write the frame out as a PNG.
    open(my $out, '>', "frame$filename.png")
        or die "Unable to write frame$filename.png: $!";
    binmode $out;
    print $out $im->png;
    close($out);
    sleep(300);
}


Wow, that's ambitious for blogspot to handle isn't it?

A few things to mention here.
1. There is a wrapper that means it will loop forever. I last used this particular version of the code to collect Twitter data every 5 mins (which you can see with the sleep(300) above).
2. I've left some examples in there.
3. I've done some dodgy maths to get the image size.
4. I've commented out the top 2 lines.

So please, take my code, have some fun. Rewrite it (take out the rubbish stuff like the string ASCII->HEX->DEC manipulation).

If you do improve it, let me know what you did.
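As a starting point, here is one way the HEX round trip could be dropped, sketched in Python rather than Perl. The function name pixels_from_bytes is made up, and it only builds the list of colours; drawing them is left to whatever image library you like.

```python
import math


def pixels_from_bytes(data: bytes) -> list:
    """Group raw bytes into (r, g, b) tuples, keeping only a full square's worth."""
    size = math.isqrt(len(data) // 3)
    wanted = size * size
    return [tuple(data[i:i + 3]) for i in range(0, wanted * 3, 3)]


page = b"<html><head>I love html</head><body>I really love html</body></html>"
pixels = pixels_from_bytes(page)
print(len(pixels), pixels[0])  # 16 (60, 104, 116)
```

Each byte is already a number between 0 and 255, so it can be used as a colour value directly.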

I'll let you work out how I did the mp3 video, and the text-> music. It won't be hard now.

Thanks for reading,

How to Make the Internet Beautiful.

Hi All,

I was sat down bored the other day. The internet is the most bland, boring and unexciting place at the moment. It's dull.

So I wondered. Can I make the internet more interesting? And the answer, surprisingly, was yes.

I turned it into pictures, like this:



Now this looks amazing. All that from some data. I've been playing with this for a while, and here are a few videos of the results of my testing:





Now this is all well and good. I can change the internet into images and set them to music. (I'll deal with the 'how' in the next blog as it's quite detailed.)

It was then that I realised I could put absolutely anything through my encoder. So I wondered... can I encode an mp3? The answer is yes. Yes I can. What's interesting is that you can see the file header at the start and the tail at the end too.





After that I thought a bit more... I wonder what the internet would sound like? What would happen if I used audio as the output?

Well....here it is (TURN YOUR VOLUME DOWN FIRST!):



Feel free to have a listen and a watch or 3. Make some comments, or if you have any other ideas, let me know.

Thanks for reading!

Friday, 5 April 2013

Bash script to download Infinite Monkey Cage episodes.

Hi all,

It's been a while but I thought I might share this with you.
I am a fan of a BBC Radio 4 show called "The Infinite Monkey Cage". Hosted by Prof. Brian Cox no less.

Here is an over engineered bash script to download all their episodes:

wget http://www.bbc.co.uk/podcasts/series/timc/all/ ; cat index.html | grep -Po "timc\_[0-9]*\-[0-9]*[a-z]\.mp3" | sed "s/^/wget http\:\/\/downloads.bbc.co.uk\/podcasts\/radio4\/timc\//g" | sh ; yes | rm index.html
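If the one-liner is hard to follow, the extraction step in the middle of it is sketched below in Python. The sample HTML and the episode file names in it are made up; only the pattern and the download prefix come from the command above, and the sketch prints the URLs rather than downloading anything.

```python
import re

PATTERN = re.compile(r"timc_[0-9]*-[0-9]*[a-z]\.mp3")
PREFIX = "http://downloads.bbc.co.uk/podcasts/radio4/timc/"


def episode_urls(html: str) -> list:
    """Find every episode file name in the page and turn it into a download URL."""
    return [PREFIX + name for name in PATTERN.findall(html)]


# Hypothetical snippet standing in for the downloaded index.html.
sample = '<a href="timc_20130101-1200a.mp3">one</a> <a href="timc_20130108-1200b.mp3">two</a>'
for url in episode_urls(sample):
    print(url)
```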

Download and enjoy!

Thanks for Reading

Tuesday, 25 September 2012

How to configure Apache2.2 as a Reverse Proxy

Hi All,

The other day I was asked to create a proxy. But not just any proxy. A proxy that could handle content and that could forward on and return login requests to a 3rd party.

Ok. So that bit was new. They have just asked me to configure a tiny Content Delivery Network.

So first things first. I cracked out the 12.04 server version of Ubuntu, installed it, configured it then installed Apache.

Nice and easy.

The reason why I chose Ubuntu over the other Linux builds is that the way Ubuntu package Apache makes it really, really easy to configure. I'm not going to go into which flavour of Linux is the best, neither do I care. This is because I AM A GROWN UP.

On with the show!

So you will need some bits and pieces to go with Apache. You will need the proxy, proxy_http and headers mods. You can enable these by typing the following as root (or with sudo):

a2enmod proxy
a2enmod proxy_http
a2enmod headers

Again, this is really easy stuff.

Now we have our mods enabled, let's go and configure something. First make sure that Apache is working.

sudo service apache2 restart
Then point a browser at your server. You should get an 'It works!' message.

Now for the config.
Go to /etc/apache2/sites-enabled
Now use a text editor to edit 000-default.

Here is my config:

ProxyPreserveHost On
ProxyVia full
<Proxy *>
Order deny,allow
Allow from all
</Proxy>

ProxyPass / http://xxx.xxx.xxx.xxx:8888/

ProxyPassReverse / http://xxx.xxx.xxx.xxx:8888/

Header edit location 192.168.1.2 192.168.1.2:81


Let's go through this bit by bit.

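ProxyPreserveHost... this passes the Host header from the incoming request through to the proxied server, rather than the hostname given in ProxyPass
ProxyVia full... adds a Via header (including the Apache version) to the proxied requests and responses, so we can see which proxies they passed through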

The next bit is the permissions for the proxy. Configure the permissions in this as if it was a website. Don't dump your proxy onto the internet with the settings above. It's not secure. These are instructions on how to make a proxy, not how to make a secure proxy.

Now we're into the fun bit.

ProxyPass / http://whateverIPorSite.com/

This makes incoming connections proxy out to whatever the target is.

ProxyPassReverse / http://whateverIPorSite.com/

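This makes returning connections proxy back correctly: redirect headers coming back from the target are rewritten so that they point at the proxy rather than at the target.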

That is basically a proxy right there. It'll work. Restart Apache, point your browser at it and it'll work.
Now let's have some fun.

I have an authentication server somewhere upstream. It's a 3rd party service. My clients need to authenticate with that before getting the content they want. There are 2 streams to this: the authentication stream and the content stream. To make my life easier I decided that I would use 2 ports. Port 80 would handle authentication with the 3rd party provider and 81 would serve the content. To that end I created a new site on port 81 and put the content on it. Now we need to address that Header edit location line up there.

When you authenticate with your 3rd party it should return a 302. In that response you will get a header called "Location", which is where the content is due to be served from. When you have a 3rd party site handling your authentication you don't necessarily want to download your content from them. You'd rather use Akamai or something. This means rewriting the header at the proxy, before it gets back to the client.

In my case I needed to change the port number. First the command:

Header edit Location

This tells the mod I want to edit one of the returning headers called Location.
Next I have: 192.168.1.2 192.168.1.2:81

What this does is tell the mod to replace the IP of the server, with the IP and the new port of the server. When the client gets the location header, it will pop off and download it from this source rather than the 3rd party authorisation provider.
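To see what that replacement does to a header value, here is a small Python sketch. It only imitates the edit with Python's re module, which is an approximation of what Apache does: the function name edit_location and the sample URL are made up, and it assumes the first match is the one that gets replaced.

```python
import re


def edit_location(value: str, pattern: str, replacement: str) -> str:
    """Replace the first match of pattern in a header value, as a stand-in for Header edit."""
    return re.sub(pattern, replacement, value, count=1)


# Hypothetical Location header returned by the authentication service.
location = "http://192.168.1.2/content/video.mp4"
print(edit_location(location, r"192\.168\.1\.2", "192.168.1.2:81"))
# http://192.168.1.2:81/content/video.mp4
```

The first argument is treated as a regular expression, so an unescaped dot matches any character. Escaping the dots, as in the sketch, is the stricter choice.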

That's about it. Questions below.

Thanks for reading.