I’ve just updated the webpage content analyser, as the version I originally uploaded was a development version and didn’t actually work!

Anyway, the new version still needs tweaking, but it gives you an idea of the kind of output it can produce. The invocation syntax and the download link are unchanged; both can be found on the web content analyser page in the right-hand menu.

 

At work we use Dansguardian to provide content-based filtering for our users. In a few instances we found that Dansguardian’s default phraselists were blocking content that we wanted our users to have access to, so we needed to modify them.

We had a problem though: we didn’t know which words were common across a set of pages, so we had nothing to base our phraselist changes on. So I wrote a tool, WPA, that takes a webpage, strips all the formatting and gives a list of the most common phrases and words.
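To give a feel for the kind of processing involved, here is a minimal sketch in Java. It is not WPA's actual source (that is bundled in the jar); the class name PhraseCountSketch and its methods are made up for illustration, the tag stripping is a crude regex rather than a real HTML parser, and it works on a hard-coded string instead of fetching a site.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the code shipped in WPA.jar.
class PhraseCountSketch {

    // Crude tag stripper: drops script/style blocks, then any remaining tags.
    static String stripMarkup(String html) {
        return html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]*>", " ");
    }

    // Lower-case the text and split it into runs of letters and apostrophes.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}']+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Count every run of n consecutive words (n = 1 gives single words).
    static Map<String, Integer> countPhrases(List<String> words, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= words.size(); i++) {
            counts.merge(String.join(" ", words.subList(i, i + n)), 1, Integer::sum);
        }
        return counts;
    }

    // Return the `limit` most frequent entries, ties broken alphabetically.
    static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts, int limit) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample page standing in for a downloaded site.
        String html = "<html><body><h1>Free games</h1>"
                + "<p>Download free games today.</p></body></html>";
        List<String> words = tokenize(stripMarkup(html));
        System.out.println(top(countPhrases(words, 1), 5)); // most common words
        System.out.println(top(countPhrases(words, 2), 5)); // most common two-word phrases
    }
}
```

Counting two-word (or longer) runs alongside single words is what makes the output useful for phraselists, since a phrase can be far more telling than either of its words on its own.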

To use WPA do the following:

java -jar WPA.jar site1 site2 site3 …

You can also use it to see which words you should be banning if you have a set of bad pages you don’t want your users to view. Your blocked domains list is a good input for this, as the results help you block similar content:

cat /etc/dansguardian/lists/global-block/domains | java -jar WPA.jar

All you need to do then is decide what phrases to add to your weighted phrase lists!

Of course, the beauty of this program is that it’s cross-platform thanks to Java, so you’ll be able to run it on Linux, Windows or Mac. If you want the source code, it’s all bundled up in the jar.

To get hold of the jar Click Here.

 

For those of you who don’t know, I use Ubuntu as my main operating system: at home, at work and on my laptop.

However, I still have applications, mostly games, that I need to keep a Windows install handy for. At the moment at home I dual boot between Windows Vista and Ubuntu Intrepid Ibex.

With all the recent advances in virtualisation technology, I thought I’d take a few of the products for a spin and see if I could get any decent games to run on them. The ideal situation would have been to boot into Ubuntu, double-click a shortcut and let the virtual machine present the game without the Teletubbies XP background.

My main game of choice at the moment is Oblivion, which at the time of writing is a few years old, so you could be excused for thinking that virtual machines would have the technology available to either pass the 3D rendering to the host GPU or at least emulate a 3D device in some way.

In practice, it appears we couldn’t be further from that.

I tried a Windows XP virtual machine in VirtualBox with 2 GB of memory and 128 MB of video memory, and Oblivion crashed out with an error. To be fair to VirtualBox, its seamless mode was very good apart from one glitch in drawing my background, and I was impressed with its seamless mouse interaction between host and guest OS.

I also tried to install Parallels, but the program itself wouldn’t actually install. This was a shame because, from the googling I’ve done, Parallels seems to be the solution closest to having 3D acceleration working. I would also have thought that, being a relatively smaller player in the VM market, Parallels would want to make sure their solution installs on as many platforms and distributions as possible.

I did find a patch to fix the problem I encountered with Parallels, but by the time I had done all that googling I couldn’t be bothered to patch and recompile. In a tough market like virtualisation, products need to just work or people will move on to another solution that does. Perhaps Parallels should sit up and take note? (According to my googling, installing Parallels on Ubuntu has always required more than average user knowledge, so this is nothing new…)

For fairness I also tried installing Oblivion via Wine and CrossOver Games, neither of which worked.

I guess my days of dual booting will continue for a while yet, until virtualisation solutions provide proper 3D acceleration support or the devs at Wine fix the bugs that affect DirectX games.

© 2011 The Think Tank