Nevyn's Blog

Thursday, 29 January 2009

To tweak, or not to tweak? (mysql)

Assuming you have a cluster of mysql servers serving your application, and you have configured them to share the load, yada yada, your next step is probably going to be asking 'how can I improve the design of database schema?'

Here are some links which cover that field (may I remind you to backup your database before starting, follow them at your own risk, and you cannot hold me responsible for any resulting issues!)

This one illustrates tweeking a Joomla CMS application on a VPS (shared hosting server):
http://www.joomlaperformance.com/articles/server_related/tweaking_mysql_server_23_16.html

This one discusses the differences between innodb and myisam db engines:
http://worthposting.wordpress.com/2008/06/22/innodb-or-myisam/

This appears to be the command to convert a myisam (the default) table to an innodb one, if you wanted to persue that, but beware the monolithic file may need more proactive maintenance and monitoring (for corruption etc, but not fragmentation issues). Again BACKUP!!

-- "ALTER TABLE foo ENGINE=InnoDB"

This one discusses locking, and mysql transactions - no transactions on myisam though! (hence you might need innodb for that)
http://forums.whirlpool.net.au/forum-replies-archive.cfm/498624.html

There's also the problem of race conditions: two select queries (two mysql sessions, webrowsers etc) return the same row, which the application wants to update. The first to 'accepted, the second to 'rejected' - row or table locking will prevent them from writing the new value at the same time, but what happens if both processes complete in lockstep, first 'accept' ing, then 'reject'ing the same row - if only one possible outcome was acceptable? (in other words, a row cannot be accepted after it has been rejected, and vice versa.) We could prevent accidental overwrites by increasing the criteria of the where clause:

UPDATE `orders`
SET `status`='accepted'
-- the first where clause identifies the row
WHERE `order_id`='".$this->DB->quotestring($order_id)."
-- and the second helps prevent 'logical' data collisions!
AND `status` != 'rejected'';

Lastly, this one illustrates a number of different ways one could design a schema
"including discussions on schema architecture, common data access patterns, and replication/scale-out guidelines" using web 'tagging' as an example.

http://forge.mysql.com/wiki/TagSchema

Labels: , , , , , , ,

Tuesday, 20 January 2009

tmobile usb3g on ubuntu / eee

Updated May 2009!
I have been informed that after upgrading the eee to Ubuntu 9.04 the usb 3g stick no longer works. Even after recompiling the drivers against the kernel, and a few other attempts to solve it, it would appear that the USB-serial driver which the stick depended on is no longer in the kernel (can anyone verify?).

Current workaround for me is to stay on Ubuntu 8.10 a little while longer, and hope the manufacturer gets the updated drivers out soon =(

---

Having just recently aquired both an Asus EEE 901 and a t-mobile 530 usb 3g broadband stick, work began on figuring out how to install.

On a stock Ubuntu (8.10) distribution, download and extract the 2 tar.gz files. These contain the v1.6 driver (as opposed to the v1.2 in Ubuntu). The README and other documentation aren't very clear, but this is basically how I did it. They might not work for you, so don't blame me if you end up summonning Cthulhu all over your hard drive, or anything else nasty happens.

I've only put them here so I can download them to my own EEE after I replace Xandros by ethernet (The eee WiFi works, but the Xandros WPA is broken) and obviously, I cant use the usb3g yet.... ;-)

# this disables the old option ZeroCD driver
echo "blacklist option" >> /etc/modprobe.d/blacklist
tar xzfmv hso-1.6.tar.gz
tar xzfmv udev.tar.gz
# now cd into each directory and
make install
reboot

I found that the system can crash if the stick is removed.
These are the files from the CD rom that came with the stick, I shall post the md5sums but they matched those available to download elsewhere, so you might as well use the CD if you have it.

Linux drivers:
hso-1.6.tar.gz: 30.05 KB
udev.tar.gz: 27.19 KB

EEE drivers (for Xandros?) - not required for Ubuntu
hso_connect.sh: 7.14 KB
hso-modules-2.6.21.4-eeepc_1.4-4+5_i386.deb: 30.13 KB
hso-udev_1.4-4_i386.deb: 21.03 KB

Edit: Sorry folks, looks like I'd messed up the urls, I've patched them now! =)

Labels: , , , , , ,

Thursday, 15 January 2009

Using the DCMD distributed shell to manage your cluster

'dcmd' from sourceforge.net provides a suite of commands that can be used, such as dssh. once you have given it a list of user@hostnames, it will send the same command to each one of your desired servers, and echo the response from each one.

for example, `dssh uptime` would be able to respond with all the uptimes of the linux / unix servers in your network or server farm. You could extend this to monitor for failed services, and attempt to restart them, for example...

This article explains everything:
http://www.samag.com/documents/s=8892/sam0310e/sam0310e.htm

Labels: , , , , ,

Wednesday, 14 January 2009

Create a new Samba user profile

Just a quick one tonight - how to quickly add a new user profile in (Ubuntu) linux and export their home profile directory as a Samba share. (You need to add them twice, once for linux and once for Samba.)

# useradd -d /home/foo -m foo
# passwd foo
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
# smbpasswd -a foo
New SMB password:
Retype new SMB password:
Added user foo.

a couple of things to note - you could use 'adduser' instead of useradd, but that will prompt you for the users info, which you may find you prefer (if in a non-scripted evironment)

also 'smbpasswd foo' might be required if, after running the command to add a new Samba user, it creates that user with a locked account - this should allow you to set the password and unlock them.

now connect to \\servername\username

Quick troubleshooting steps:
check all spellings in the /etc/samba/smb.conf
check spellings on windows client
read the comments in the smb.conf file! (especially the testparm hint!!)

Of course, I'm assuming you already know how to sudo apt-get install samba, configure your '[homes] directory exports in the smb.conf file, and restart samba '/etc/init.d/samba restart'. If not I got all I needed from:
google books (Hacking Ubuntu
By Neal Krawetz)

Labels: , , , , , , , ,

Tuesday, 13 January 2009

MySQL multimaster cluster

I've been looking at ways of clustering my MySQL servers, to provide failover capabilities and also to distribute the load coming in from a farm of web servers.

There are many ways of loadbalancing, from DNS round-robin through Apache ServerMap directives and so on. Once you get it sorted, and have implemented some sort of session system so a visitor can be serverd by any web node - how do you split the load on the database servers yet maintain consistency?

If you do not have machines powerful enough to look at the MySQL Cluster methods (from 5.0 upwards) then you will probably have looked into replication. But now you have another problem.... the write-to-master, read-from-slave scenario.

OK, it isn't a problem by itself. But if you run an application like wordpress, for example, which keeps the server address in the database - you now have the choice of modifying the application to be aware of more than one server (the master for writes, the slaves for reads) -or- modify each database so it has a different value.
This could be very tricky in a replicated slave environment, as changes there will be overwritten again by the master, and lost?!

So, if you don't want to have a major headache rewriting your application, or customising your databases, have a look at multimaster replication.

For my example, it involves running a mysql server instance on each webserver - and setting the ip address in the database to 127.0.0.1 instead of using a separate real ip each time. This may not be applicable to you, but it means for me, I can run any number of nodes, each with their own dedicated MySQL datasource, which will be identical across all the nodes. That's it!

http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html

* again thanks to Matt for pointing out that replication traffic could spiral exponentially, with every additional node, if packets are sent to unicast addresses. I will try and configure each node to send updates to the multicast address, so each SQL SELECT, UPDATE, and so on is replicated to the network only once.

Labels: , , ,

Saturday, 10 January 2009

backup with rsync

OK, so we're starting to build our cluster now, which is all well and good, except we haven't discussed backups yet.

We are looking at having multiple physical servers share the traffic (Load balancing) and as they are online, will serve as hot replacements in the event of a machine or daemon process dying (failover). We could replicate our database, to reduce single point of failure - but this doesn't safeguard against a user 'DROP'ing our tables and data. We might use RAID, so that if one hard drive fails, the data is not lost - but again this doesnt stop an errant "rm -rf /*" (don't try this at home kids!)

So, we copy our data from one machine to another, from where it can be written to usb disk, tape drive, dvd, usb flash, whatever... and possibly exported back to the network as a read only share.

rsync -avz -e 'ssh -l username' /local/source/dir/ remotefoo:/destination/dir

I like to use the command like this, so I know I'm using ssh to copy, with compression. see the man rsync page for more options.

Labels: , , ,

Ubuntu raid - postinstallation

https://wiki.ubuntu.com/Raid has a simple script for creating a RAID array which you can adapt to your specific requirements - In my case I used it to add 2 shiny new 640GB drives to my Ubuntu machine after installing a copy of Ubuntu Desktop (I didnt bother reinstall the OS, I just added it as a data store. If my OS ever self-destructs, I can just boot off my USB drive and recover my data from there - although you may need to "sudo apt-get install mdadm".

Labels: , , ,

Friday, 9 January 2009

locate a package for Ubuntu

OK I admin Im being lazy and didnt want to follow on from my last post by downloading and installing UltraMonkey by hand. But since I was away from my Ubuntu machine, I couldnt fire up Synaptic to see if there was a Ubuntu release of the package I wanted.

As discovered by a colleage of mine, http://packages.ubuntu.com/ is the answer - allows you to search for what you want via the web instead. Cheers, Matt! =)

Of course the package you want might not be listed, so perhaps it is available from another repository? Then you might find you need to edit your sources.list file.

Labels: , , , ,

Make a redundant load balanced web or db server

OK, a clustered web server is not the same as a Beowulf (I never said it was).
However both projects can share practices such as diskless booting, BOOTP, DHCP (with fixed IPs for server MAC addresses) and so on. One thing I want to implement now is to have a service, such as mysqld or apache, on more than one node, with a load balancer (which also has a redundant node) serving requests on one incoming IP.

This can be acheived with UltraMonkey and heartbeat:

http://www.debianhelp.co.uk/ultramonkey.htm

Labels: , , , , , , , , , ,

Tuesday, 6 January 2009

Beowulf Project

Here's a snippet of Beowulf related links, retreived from http://web.archive.org/web/20030603220253/www.canonical.org/~kragen/beowulf-faq.txt

Whats a Beowulf? Its a supercomputer/cluster built from commodity PCs. These projects fueled my enthusiasm for computing - and form the basis of my approach to have a large system built of many smaller components working in symphony =)

My intention is to reproduce a Beowulf environment using more modern hardware and software, with as much failover tolerance as reasonably practical.

Thus you will find posts so far on Linux software RAID, clustering, redundancy and load balancing. Furture posts may discuss GFS / Lustre / Coda file systems, or simply using NFS exports with rsync. (Keep it simple?) Ultra Monkey and heartbeat might get mentioned.

Maybe I'll get around to putting an application on a machine, and can write about MPI & PVM vs Hadoop & MapReduce?

ps if the links dont work, try http://web.archive.org/ in front of the url.

Docs:

boilerplate software installation: [1999-05-13]
http://www.phy.duke.edu/brahma/#boilerplate

Beowulf HOWTO: [1999-05-13]
http://www.sci.usq.edu.au/staff/jacek/beowulf/BDP/HOWTO/

more boilerplate software installation: [1999-05-13]
http://www.lsc-group.phys.uwm.edu/~www/docs/beowulf/Proto-slave/autoinstall.html
http://www.lsc-group.phys.uwm.edu/~www/docs/beowulf/Slave/Slave_build.html

A little more information on "how to build a Beowulf": [1999-05-13]
http://beowulf.gsfc.nasa.gov/howto/howto.html

Yet another Beowulf Installation Howto: [1999-05-13]
http://lcdx00.wm.lc.ehu.es/~svet/beowulf/howto.html

Building a Beowulf System: [1999-05-13]
http://www.cacr.caltech.edu/research/beowulf/tutorial/beosoft/

How to Build a Beowulf-Class Cluster (slides): [1999-05-13]
http://smile.cpe.ku.ac.th/smile/beotalk/index.htm
(partly in Thai)

How to Build a Beowulf: An Electronic Book (slides): [1999-05-13]
http://smile.cpe.ku.ac.th/beowulf/index.html

Beowulf Installation and Administration HOWTO: [1999-05-13]
http://www.sci.usq.edu.au/staff/jacek/beowulf/BDP/BIAA-HOWTO/


Books:
How to build a Beowulf, from MIT Press [1999-05-13]

Labels: , , , ,

Monday, 5 January 2009

Scripted or manual install of linux from scratch

The page at http://ubuntuforums.org/showthread.php?t=293299 shows how to boot a Dapper/Edgy live CD and auto install an OS on a LUKS encrypted LVM.

I've posted it here because the script shows how to build the OS from the ground up - which is something I will try to perform on Intrepid.

ubuntu@ubuntu:~$ wget http://eli.criffield.net/auto_cr_inst-edgy
ubuntu@ubuntu:~$ chmod +x ./auto_cr_inst-edgy
ubuntu@ubuntu:~$ ./auto_cr_inst-edgy

If you want to learn even more about the basic building blocks of linux, try installing gentoo, or visit the Linux from scratch website

Labels: , ,

Howto make your server highly available

Setting up a highly available NFS server (5 pages)
http://www.howtoforge.com/high_availability_nfs_drbd_heartbeat

Mirror Your Web Site With rsync (2 pages)
http://www.howtoforge.com/mirroring_with_rsync

How To Set Up A Loadbalanced High-Availability Apache Cluster (4 pages)
http://www.howtoforge.com/high_availability_loadbalanced_apache_cluster

How To Set Up A Load-Balanced MySQL Cluster (8 pages)
http://www.howtoforge.com/loadbalanced_mysql_cluster_debian

How To Set Up Database Replication In MySQL (2 pages)
http://www.howtoforge.com/mysql_database_replication

Labels: , , , , , , , , , ,

Friday, 2 January 2009

Convert MYSQL to UTF8 character set on Ubuntu 8.10

I used the following methods to convert my server to UTF-8, using standard Ubuntu packages, eg mysql-server from synaptix, nothing had to be recompiled.

They may not work in your environment. Use them at your own risk, and BACKUP everything first. (Preferably test before using it on any production server.) I post them only as a personal memory aid. If you use the information here and your server bursts into flames, you accept responsibility for that.

PuTTY /SSH:
In PuTTY 0.60 you can change the option "Window > Translation > Received data assumed to be in which character set:" to 'UTF-8'. After this all received data will be interpreted as UTF-8 and displayed correctly. (For anyone ssh'ing to the mysql server and using the mysql> prompt from there!)

Bash:
try adding something like this to your ~/.bash_profile
echo -ne '\e%G\e[?47h\e%G\e[?47l'
There's no point trying to work in UTF-8 if your computer thinks you are speaking ISO 8859-1!! This should even fix vi, so AltGr+4 correctly gives the Euro '€' and AltGr+` [backtick] gives the broken pipe '¦', instead of displaying nothing, or something odd like an accented A¦ pair. (or a dot in bash). Of course, Windows notepad will choke on the extended character, but something like notepad++ will work to display it.

MySQL:
A) Edit /etc/mysql/my.cnf, locate the section that starts [mysqld], add the lines below to configure your server, and then restart mysql. This should switch the server to UTF-8 'mode'.

[mysqld]
# utf8
init-connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_general_ci

B) # /etc/init.d/mysqld restart

afterwards you should find the following all correctly updated to utf-8
> Variable Session value Global value
> character set client utf8 latin1
> character set connection utf8 latin1
> character set database latin1 latin1
> character set results utf8 latin1
> character set server latin1 latin1
> character set system utf8 utf8
> collation connection utf8_general_ci latin1_swedish_ci
> collation database latin1_swedish_ci latin1_swedish_ci
> collation server latin1_swedish_ci latin1_swedish_ci

This pinched from https://lists.ubuntu.com/archives/ubuntu-users/2006-February/067367.html because it was so very useful =)

C) Now follow this guide: http://www.nicknettleton.com/zine/php/php-utf-8-cheatsheet which you can also find copied below. Note that I was able to convert my latin1 encoded database (containing a wordpress blog) and all its tables (one-by-one) into utf8 using the below commands whilst the server was 'hot' - but take care! (and backups!!). Put db/table names with hyphens into backticks, e.g. `wp-blog`.

--------
To support worldwide languages, you need to use UTF-8 encoding for your web pages, emails and application, rather than ISO 8859-1 or another common western encoding, since these don't support characters used in languages such as Japanese and Chinese.

Happily, UTF-8 is transparent to the core Latin characterset, so you won't need to convert all your data to start using UTF-8. But there are a number of other issues to deal with. In particular, because UTF-8 is a multibyte encoding, meaning one character can be represented by more one or more bytes. This causes trouble for PHP, because the language parses and processes strings based on bytes, not characters, and makes mincemeat multibyte strings - for example, by splitting characters 'in half', bodging up regular expressions, and rendering email unreadable.

There are a number of great articles online about UTF-8 and how it works - Joel Spolski's comes to mind - but very few about how to actually get it working with PHP and iron out all the bugs. So, here to save you the time we put in, is a quick cheatsheet and info about a few common issues.

1. Update your database tables to use UTF-8

CREATE DATABASE db_name
CHARACTER SET utf8
DEFAULT CHARACTER SET utf8
COLLATE utf8_general_ci
DEFAULT COLLATE utf8_general_ci
;

ALTER DATABASE db_name
CHARACTER SET utf8
DEFAULT CHARACTER SET utf8
COLLATE utf8_general_ci
DEFAULT COLLATE utf8_general_ci
;

ALTER TABLE tbl_name
DEFAULT CHARACTER SET utf8
COLLATE utf8_general_ci
;

2. Install the mbstring extension for PHP

Windows: download the dll if it's not in your PHP extensions folder, and uncomment the relevant line in your php.ini file: extension=php_mbstring.dll
Linux: yum install php-mbstring

3. Configure mbstring

Do this in php.ini, httpd.conf or .htaccess. (Remember to prepend these with 'php_value ' in httpd.conf or .htaccess.)

mbstring.language = Neutral ; Set default language to Neutral(UTF-8) (default)
mbstring.internal_encoding = UTF-8 ; Set default internal encoding to UTF-8
mbstring.encoding_translation = On ; HTTP input encoding translation is enabled
mbstring.http_input = auto ; Set HTTP input character set dectection to auto
mbstring.http_output = UTF-8 ; Set HTTP output encoding to UTF-8
mbstring.detect_order = auto ; Set default character encoding detection order to auto
mbstring.substitute_character = none ; Do not print invalid characters
default_charset = UTF-8 ; Default character set for auto content type header

4. Deal with non-multibyte-safe functions in PHP

The fast-and-loose way to do this is with the following php configuration:

mbstring.func_overload = 7 ; All non-multibyte-safe functions are overloaded with the mbstring alternatives

But there are problems with this. php.net has a warning about this potentially affecting the whole server. And even if this isn't an issue for you, mbstring can make a mess of binary strings.

So, a better route is to search your application code for the following functions, and replace them with mbstring's 'slot-in' alternatives:

mail() -> mb_send_mail()
strlen() -> mb_strlen()
strpos() -> mb_strpos()
strrpos() -> mb_strrpos()
substr() -> mb_substr()
strtolower() -> mb_strtolower()
strtoupper() -> mb_strtoupper()
substr_count() -> mb_substr_count()
ereg() -> mb_ereg()
eregi() -> mb_eregi()
ereg_replace() -> mb_ereg_replace()
eregi_replace() -> mb_eregi_replace()
split() -> mb_split()

5. Sort out HTML entities

The htmlentities() function doesn't work automatically with multibyte strings. To save time, you'll want to create a wrapper function and use this instead:

/**
* Encodes HTML safely for UTF-8. Use instead of htmlentities.
*
* @param string $var
* @return string
*/
function html_encode($var)
{
return htmlentities($var, ENT_QUOTES, 'UTF-8') ;
}

6. Check content-type headers

Check through your code for any text-based content-type headers, and append the UTF-8 charset, so the browser knows what it's working with:

header('Content-type: text/html; charset=UTF-8') ;

You should also repeat this at the top of HTML pages:
(replace square brackets with angle ones!!)

[meta http-equiv="Content-type" value="text/html; charset=UTF-8" /]

7. Update email scripts

Email can be tricky. You'll need to update the content-type for any emails and text-based mime parts to use UTF-8 encoding. You'll also need to alter the way in which headers are encoded to use UTF-8. mbstring provides a function mb_encode_mimeheader() to handle this for you, but it does make a mess of address lists: you'll need to encoding the name and address parts seperately, then compile them into an address list.

Be sure to encode the subject and other headers too - Korean speakers will tend to put Korean text for the subject.

9. Check binary files and strings

Finally, double check any binary files and strings handled by PHP, particularly uploads, downloads and encryption. In some cases it may be necessary to revert to ASCII just before a download or processing a binary string.

--------

MySQL clients:
Check the options in your mysql clients - should be switched to UTF8 mode also, or you might get your results in latin1, or something wierd like that.


Perhaps use tips from http://tlug.dnho.net/?q=node/276 if you are trying to mysqldump & restore from latin1 to utf8.

PHP:
Use the above to set your PHP / front end into UTF8 (see mbstrings section).

Labels: , , , , ,