Martin Scholl (@zeit_geist) has started a new project based on the PBXT storage engine: EPBXT – Embedded PBXT! In his first blog he describes how you can easily build the latest version: Building Embedded PBXT from bzr.
The interesting thing about this project is that it exposes the “raw” power of the engine. Some basic performance tests show this really is the case.
At the lowest level, PBXT does not impose any format on the data stored in tables and indexes. When running as a MySQL storage engine it uses the MySQL native row and index formats. Theoretically it would be possible to expose this in an embedded API. The work Martin is doing goes in at this level. The wrapper around the engine determines the data types, data sizes, row and index format. Comparison operations for the data types are also supplied by the embedded code or user program.
This flexibility will make it possible for an application to store its own data very efficiently. As Martin suggested, it would also be possible to use Google’s protobufs for the row format. This would eliminate the need to use an ALTER TABLE for many types of changes to a table’s definition!
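To make the idea a little more concrete, here is a toy sketch in Python. It is not EPBXT code and does not use any EPBXT API: OpaqueIndex and compare_i32 are hypothetical names invented purely for illustration. It only shows the division of labour described above, where the storage layer keeps opaque bytes and the application supplies the comparison.

```python
import bisect
import struct
from functools import cmp_to_key


class OpaqueIndex:
    """Toy ordered index (hypothetical): stores raw bytes, caller supplies ordering."""

    def __init__(self, compare):
        # compare(a, b) returns a negative, zero or positive number
        self._key = cmp_to_key(compare)
        self._rows = []  # raw byte strings, kept in comparator order

    def insert(self, raw):
        keys = [self._key(row) for row in self._rows]
        position = bisect.bisect_right(keys, self._key(raw))
        self._rows.insert(position, raw)

    def scan(self):
        return list(self._rows)


def compare_i32(a, b):
    """Application-defined comparison: rows are little-endian signed 32-bit ints."""
    x = struct.unpack("<i", a[:4])[0]
    y = struct.unpack("<i", b[:4])[0]
    return (x > y) - (x < y)


index = OpaqueIndex(compare_i32)
for value in (3, -1, 256):
    index.insert(struct.pack("<i", value))

decoded = [struct.unpack("<i", row)[0] for row in index.scan()]
print(decoded)  # [-1, 3, 256]
```

A plain byte-wise sort of the same rows would put 256 first and -1 last, which is why the ordering has to come from whoever defines the row format.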
Of course, EPBXT is still some way from realizing this vision, and Martin has some very specific problems he wants to solve with the development. However, judging by his command of the code within such a short time, this is going to be a project to watch in the future!
First, let’s check whether the write cache is disabled for the zvol rpool/iscsi/vol1:
milek@r600:~/progs# ./zvol_wce /dev/zvol/rdsk/rpool/iscsi/vol1
Write Cache: disabled
Now let’s issue 1000 writes:
milek@r600:~/progs# ptime ./sync_file_create_loop /dev/zvol/rdsk/rpool/iscsi/vol1 1000
real 12.013566363
user 0.003144874
sys 0.104826470
milek@r600:~/progs# ./zvol_wce /dev/zvol/rdsk/rpool/iscsi/vol1 1
milek@r600:~/progs# ./zvol_wce /dev/zvol/rdsk/rpool/iscsi/vol1
Write Cache: enabled
milek@r600:~/progs# ptime ./sync_file_create_loop /dev/zvol/rdsk/rpool/iscsi/vol1 1000
real 0.239360231
user 0.000949655
sys 0.019019552
Worked fine.
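For a quick sense of scale, the two ptime results can be turned into per-write latencies. This is only back-of-the-envelope arithmetic on the numbers printed above; the variable names are mine.

```python
# Wall-clock ("real") seconds for 1000 writes, copied from the two ptime runs.
WRITES = 1000
real_wce_off = 12.013566363  # write cache disabled
real_wce_on = 0.239360231    # write cache enabled


def per_write_ms(total_seconds, count):
    """Average wall-clock time per write, in milliseconds."""
    return total_seconds / count * 1000.0


latency_off = per_write_ms(real_wce_off, WRITES)
latency_on = per_write_ms(real_wce_on, WRITES)
speedup = real_wce_off / real_wce_on

print(f"cache disabled: {latency_off:.2f} ms per write")  # 12.01
print(f"cache enabled:  {latency_on:.2f} ms per write")   # 0.24
print(f"speedup:        {speedup:.0f}x")                  # 50x
```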
The zvol_wce program is not idiot-proof and it doesn’t check whether the operation succeeded or not. You should be able to compile it by issuing: gcc -o zvol_wce zvol_wce.c
milek@r600:~/progs# cat zvol_wce.c
/* Robert Milkowski
   http://milek.blogspot.com
*/

#include <stdio.h>      /* printf */
#include <stdlib.h>     /* atoi, exit */
#include <unistd.h>
#include <stropts.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/dkio.h>

int main(int argc, char **argv)
{
  char *path;
  int wce = 0;
  int rc;
  int fd;

  path = argv[1];
  if ((fd = open(path, O_RDONLY|O_LARGEFILE)) == -1)
    exit(2);

  if (argc>2) {
    /* a second argument sets the write cache: non-zero enables it */
    wce = atoi(argv[2]) ? 1 : 0;
    rc = ioctl(fd, DKIOCSETWCE, &wce);
  }
  else {
    /* no second argument: just report the current setting */
    rc = ioctl(fd, DKIOCGETWCE, &wce);
    printf("Write Cache: %s\n", wce ? "enabled" : "disabled");
  }

  close(fd);
  exit(0);
}
Brian Aker, a brilliant, helpful duder from whom I learn a lot, gives a great talk explaining what NoSQL is in a way that makes sense to database guys. I warn you, there are some points in this video where you can’t hear Brian due to the audience “participation”, but you should get the content.
PyCon 2010 in Atlanta was a blast as always. While I still have things fresh on my mind, here are my top 10 takeaways from the conference, in no particular order.
1) Alternative Python implementations are getting increased attention
It seemed to me that PyPy, Unladen Swallow, IronPython and Jython got much more buzz this year. Maybe this is also due to the announcement that Unladen Swallow will be merged into Python 3.x. I recommend you watch Holger Krekel’s talk on the topic of the diverse and healthy Python ecosystem, ‘The Ring of Python‘.
I was also glad to see that 2 core Jython developers, Frank Wierzbicki and Jim Baker, were hired by Sauce Labs.
2) Testing has gone mainstream
When Mark Shuttleworth mentions automated testing in his keynote as one of the most important ingredients of a sound software engineering process, you know that automated testing has arrived.
There were also no less than 6 testing-related talks, all very good, given by the usual suspects in the testing world, people like Ned Batchelder, Titus Brown, Holger Krekel and Michael Foord.
3) Packaging and packaging
I liked Antonio Rodriguez’s distinction between packaging with small ‘p’ (distutils, setuptools, distribute) and Packaging with big ‘P’ (the python.org web site). Both are very important. There was a lot of attention to packaging, and a great show of support for Tarek Ziade’s efforts in leading the way to improving the way of distributing Python packages. And I think Antonio was right in pointing out that the python.org site needs some redesign in terms of getting a more modern and streamlined look and feel.
4) I tweet, thus I exist
I came late to the Twitter party, barely a month ago. I was resistant at first because I considered tweeting a waste of time. I still think it has a strong tendency to shorten your attention span and break your focus, so I personally need to discipline myself in how I use it.
But Twitter is a great way to keep your finger on the pulse of topics that interest you — and at PyCon, if you didn’t tweet or at least read other people’s tweets, you were out of touch, out of the picture. Alex Gaynor and company did a great job with their PyCon Live Stream site, which was pretty much the dashboard of the conference.
5) The testing goat
Terry Peppers started a new meme during the TiP BoF: the testing goat. Read Terry’s post and also Titus’s post for more details on it, but suffice it to say it was a huge success. And speaking of the TiP BoF, it ballooned from last year to this one. I estimate around 120 attendees, so more than 10% of the people at the conference. Pizza provided by Disney (thanks to Paul Hildebrandt and Roy Turner), beer provided by Dr. Brown and friends, great lightning talks and unceasing heckling made this into one of the highlights of the conference.
6) Healthy ecosystem of Python web frameworks
Two or three years ago, all the buzz was about Django and maybe TurboGears. This year, a lot of presenters talked about other frameworks — Pylons in particular, but also tornado, CherryPy, restish. It does feel like Django is the granddaddy of them all, but it also feels to me like Pylons is being preferred by big name/big traffic web sites such as reddit. Tornado of course is a newcomer, and we’re using it very successfully at Evite. The presenter from Lolapps said they were also experimenting with it and were going to put it in production for some portions of their site.
7) Inspirational keynotes
I thought the keynotes were of much higher quality than in previous years. Mark Shuttleworth talked about ‘Cadence, quality and design’ (see the bitsource interview), while Antonio Rodriguez gave a very inspirational presentation on topics such as involving everybody in your company in coding (he knows, it sounds crazy…), about the strategic advantages of using Python, about putting more stable libraries into the stdlib (he mentioned httplib2, and I couldn’t agree more — we need that library in the stdlib!), and other stuff that you can see on his pycon 2010 page. You need to watch the video of his keynote though in order to appreciate the impact that it had (videos from PyCon are being made available as we speak on blip.tv).
One thing though — I am a big Ubuntu fan, I have it on both my laptop(s) and desktop, and yet I was pained to see that Mark Shuttleworth couldn’t use his slide deck because his laptop couldn’t properly display a dual screen when using the conference projector. I struggled to make it work myself before delivering my presentation. Ubuntu really needs to get better dual-screen configuration management software.
8) It’s all about the hallway discussions
For first comers to PyCon, or for people who intend to go next year, a word of advice: skip some of the presentations and instead join random people in hallway discussions, or for a beer at the bar. Trust me, you’ll learn more than in almost any presentation. And you’ll potentially make friends that you’ll recognize the next time you go to PyCon. I’ve done this for 6 years now, and it never fails to amaze me how easy it is to get into deep technical discussions over a mind-bending range of topics. Non-technical discussions are usually mind-bending at PyCon too.
9) More talks of the advanced type please
I heard from many people (and Titus has been saying it for years) that they wished the talks were a bit more advanced. I realize PyCon needs to cater to all types of Python users, from beginning to intermediate to expert, but still the conference track could use a larger number of advanced, mind-exploding, challenging presentations (such as Raymond Hettinger’s talk). I understand though that next year there will be exactly such a track, dubbed ‘Extreme Python’, so I’m very much looking forward to it.
10) Top-notch organization
Finally, kudos to Van Lindberg, this year’s PyCon chair, and the rest of the organizers, for delivering an almost perfect experience to the more than 1,000 attendees. I thought the food was great, the WiFi was better than usual, the sessions almost always went smoothly (minus projector issues), and there was great fun and camaraderie in the air. That’s why PyCon is my (and many other people’s) favorite conference. Keep it up guys!
The mobile world is changing. It’s changing faster than the database world did back when MySQL was started and grew to be one of the most widely used databases in the world.
Change brings turbulence and it’s difficult trying to see the big picture to find the major trends. It also means different philosophies of doing things clash and fight for survival.
There are two large debates at the moment around mobile. One is about open versus closed platforms and the other is around native applications versus web based. One of them is an important philosophical issue, the other one a more technical question of the best way to bring a good user experience to mobile.
The success of the iPhone and the App Store has meant a huge leap for both mobile applications and mobile web. But the iPhone platform is closed. The entire ecosystem is controlled by one company.
On the Internet it’s (somewhat) safe to say that the philosophy of open is winning. In fact, open is at the very heart of what the Internet is. As we are moving over to a world where Internet access predominantly is from a mobile device, do we want this to be an open world or a closed one controlled by one company?
This is a very important question for the future of mobile – and the Internet at large!
There’s certainly room for both native applications and web based ones in the mobile world. But it is of utmost importance that the platform that grows to be the dominant one for native applications is an open one.
This is why I am now using an Android phone (Hero from HTC).
That is also why I chose to invest in the Swedish cross platform and open source tool for mobile development: MoSync. In this company I see the same potential as the early days of MySQL.
If you are developing a mobile application that you want to work on practically all mobile phones, you should definitely check MoSync out!
Hey, check out this commission I just got from Shawn Dickinson!

I love this guy’s drawings, and since he is one of the best when it comes to funny cats, I had to get one of Luna and Babycat. This thing is huge in real life, too! Shawn draws all sorts of cool stuff, and he’s taking commissions right now, so take a look at his blog!
Also, Andy Ristaino nominated me for a Kreativ Blogger Award! Thank you Andy!
Here are the “rules” for the Kreativ Blogger Award:
1. You must thank the person who has given you the award.
2. Copy the logo and place it on your blog.
3. Link the person who has nominated you for the award.
4. Name 7 things about yourself that people might find interesting. (I skipped this one)
5. Nominate 7 other Kreativ Bloggers.
6. Post links to the 7 blogs you nominate.
7. Leave a comment on each of the blogs to let them know they have been nominated.
Like Andy said, it’s a bit of a chain-letter type thing, but since it’s a nice way to tell people you like their blogs and get links passed around, I say it’s a-ok! There are tons of amazing blogs that I check all the time, so it was hard to only choose 7, but here they are:
Rebecca Dart- She draws everything I love…kitties, barbarians, illustrations for scary country songs…and she’s truly amazing at it.
Pedro Vargas- One of my all time favorite blogs! I love everything this guy draws. And he animates too!
Kristen McCabe- So good…Kristen is super creative and funny, and her drawings are beautiful and always interesting.
Miss Withers- She’s doing these crazy fashion-y paper cut out dolls right now that I love…and her paintings are really cool and haunting!
Brie Hermanson- This girl is talented and funny and weird and awesome. I’m excited every time she updates!
Kristy Gordon- Kristy’s paintings are breathtaking and beautiful and really stick in your head for a long time and make you keep thinking about them. O__O
Seo Kim- Seo has a lovely style that’s really unique. She can draw amazingly well, and also has an eye for beautiful color and design.
Wow, I nominated almost all girls! Sorry guys!
I’ll probably be putting up some Mighty B stuff soon, unless I somehow manage to do some new art. I’ve been stuck in another terrible rut, and the only things I’ve drawn are for projects or work that I’m not allowed to show. Thanks everyone who left me feedback on my last post! I think I’ll post some more close up drawings from Skadi again at a later time!
Coming soon:
For those who are interested, the slides for my PyCon 2010 talk ‘Creating RESTful Web Services with restish‘ are online. Leave comments please if you attended the talk and want to start discussions on some of the topics I mentioned.
This is in 5.1.44. It is easy to make mistakes like this in a large and rapidly changing code base. Why not compile with -Werror to catch the problem?
double Item_cache_decimal::val_real()
{
DBUG_ASSERT(fixed);
double res;
if (!value_cached && !cache_value())
return NULL;
my_decimal2double(E_DEC_FATAL_ERROR, &decimal_value, &res);
return res;
}
At Fosdem 2010, already two weeks ago, I had the pleasure of hearing Geert van der Kelen explain the work he has been doing on connecting MySQL and Python. I don’t know anything about Python, but anybody that has the courage, perseverance and coding skills to create an implementation of the MySQL wire protocol from scratch is a class-A programmer in my book. So, I encourage everyone that needs MySQL connectivity for Python programs to check out Geert’s brainchild, MySQL Connector/Python.
In relation to MySQL Connector/Python, I just read a post from Geert about how he uses the MySQL information_schema to generate some Python code. In this particular case, he needs the data from the COLLATIONS table to maintain a data structure that describes all collations supported by MySQL.
For some reason that I cannot fathom, Geert needed to generate a structure for each possible collation, not just the ones for which the COLLATIONS table contains a row. To do this, he wrote a stored procedure that uses a cursor to loop through the COLLATIONS table. In the loop, he detects whenever there’s a gap in the sequence of values from the ID column, and then starts a new loop to “fill the gaps”. For each iteration of the outer cursor loop, a piece of text is emitted that conforms to the syntax of a Python tuple describing the collation, and each iteration of the inner loop generates the text None, a Python built-in constant.
The final result of the procedure is a snippet of Python code shown below (abbreviated):
..
("cp1251","cp1251_bulgarian_ci"), # 14
("latin1","latin1_danish_ci"), # 15
("hebrew","hebrew_general_ci"), # 16
None,
("tis620","tis620_thai_ci"), # 18
("euckr","euckr_korean_ci"), # 19
..
In the final code, these lines are themselves used to form yet another tuple:
desc = (
    None,
    ("big5","big5_chinese_ci"), # 1
    ("latin2","latin2_czech_cs"), # 2
    ("dec8","dec8_swedish_ci"), # 3
    ("cp850","cp850_general_ci"), # 4
..
This is excellent use of the information schema! However, I am not too thrilled about using a stored routine for this. Enter my Fosdem talk about refactoring stored routines.
In this case, performance is not really an issue, so I won’t play that card. But many people that do need well-performing stored procedures might start out like Geert and write a cursor loop, and perhaps do some looping inside that loop. One of the big take-aways in my presentation is to become aware of the ways that you can avoid a stored procedure. Geert’s procedure is an excellent candidate to illustrate the point. As a bonus, I’m adding the code that is necessary to generate the entire snippet, not just the collection of tuples inside the outer pair of parentheses.
So, here goes:
set group_concat_max_len := @@max_allowed_packet;

select concat('desc = (',
         group_concat('\n '
         , if( collations.id is null, 'None',
               concat('(', '"', character_set_name, '"', ',', '"', collation_name, '"', ')')
           )
         , if(ids.id=255, '', ','), ' #', ids.id
         order by ids.id
         separator ''
         ), '\n)'
       )
from (select (t0.id<<0) + (t1.id<<1) + (t2.id<<2) + (t3.id<<3)
           + (t4.id<<4) + (t5.id<<5) + (t6.id<<6) + (t7.id<<7) id
      from (select 0 id union all select 1) t0
      , (select 0 id union all select 1) t1
      , (select 0 id union all select 1) t2
      , (select 0 id union all select 1) t3
      , (select 0 id union all select 1) t4
      , (select 0 id union all select 1) t5
      , (select 0 id union all select 1) t6
      , (select 0 id union all select 1) t7) ids
left join information_schema.collations
on ids.id = collations.id;

This query works first by generating 256 rows with id values ranging from 0 to 255. (I think I recall Alexander Barkov mentioning that this is currently the maximum number of collations that MySQL supports, but perhaps I am wrong there.) This is done by cross-joining a simple derived table that generates two rows:
(select 0 id union all select 1)
So, one row that yields 0, and one that yields 1. By cross-joining 8 of these derived tables, we get 2 to the 8th power rows, which equals 256. In the SELECT-list, I use the left bitshift operator << to shift the original 0 and 1 by 0, 1, 2 and so on up to 7 positions. By then adding those values together, we fill up exactly one byte, and gain all possible values from 0 through 255:
(select (t0.id<<0) + (t1.id<<1) + (t2.id<<2) + (t3.id<<3)
      + (t4.id<<4) + (t5.id<<5) + (t6.id<<6) + (t7.id<<7) id
 from (select 0 id union all select 1) t0
 , ... t1
 , ...
 , (select 0 id union all select 1) t7) ids
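Since the result is destined for Python anyway, the same trick can be checked outside the database. The sketch below is only an illustration of the arithmetic, not part of the query: itertools.product plays the role of the eight-way cross join, and each tuple position stands in for one of t0 through t7.

```python
from itertools import product

BITS = 8  # eight two-row derived tables, t0 .. t7

# Every combination of eight 0/1 values, like the cross join of t0 .. t7.
# Position n is shifted left by n bits and the results are added together.
ids = sorted(
    sum(bit << position for position, bit in enumerate(bits))
    for bits in product((0, 1), repeat=BITS)
)

print(len(ids), ids[0], ids[-1])  # 256 0 255
```

Each combination of bits produces a distinct number, so the 256 rows cover 0 through 255 exactly once.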
Once we have this, the rest is straightforward – all we have to do now is use a LEFT JOIN to find any collations from the information_schema.COLLATIONS table in case the value of its ID column matches the value we computed with the bit-shifting jiggery-pokery. For the matching rows, we use CONCAT to generate a Python tuple describing the collation, and for the non-matching rows, we generate None:
if( collations.id is null, 'None',
    concat('(', '"', character_set_name, '"', ',', '"', collation_name, '"', ')'))

The final touch is a GROUP_CONCAT that we use to bunch these up into a comma separated list that is used as entries for the outer tuple. As always, you should set the value of the group_concat_max_len server variable to a sufficiently high value to hold the contents of the generated string, and if you want to be on the safe side and not run the risk of getting a truncated result, you should set it to the value of max_allowed_packet.
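For readers more at home in Python than in SQL, here is a rough equivalent of what the LEFT JOIN plus GROUP_CONCAT produce. It is a sketch under assumptions: render_desc and SAMPLE are names I made up, and SAMPLE holds only the handful of collations quoted in this post rather than the full contents of information_schema.COLLATIONS.

```python
# Partial sample: only the collations quoted in this post, keyed by ID.
SAMPLE = {
    1: ("big5", "big5_chinese_ci"),
    2: ("latin2", "latin2_czech_cs"),
    3: ("dec8", "dec8_swedish_ci"),
    4: ("cp850", "cp850_general_ci"),
    14: ("cp1251", "cp1251_bulgarian_ci"),
    15: ("latin1", "latin1_danish_ci"),
    16: ("hebrew", "hebrew_general_ci"),
    18: ("tis620", "tis620_thai_ci"),
    19: ("euckr", "euckr_korean_ci"),
}


def render_desc(collations, size=256):
    """Build the 'desc = (...)' text, emitting None for every missing id."""
    pieces = []
    for i in range(size):
        row = collations.get(i)  # a missing id behaves like an unmatched LEFT JOIN row
        entry = "None" if row is None else '("%s","%s")' % row
        comma = "" if i == size - 1 else ","  # no comma after the last entry
        pieces.append("\n %s%s #%d" % (entry, comma, i))
    return "desc = (" + "".join(pieces) + "\n)"


text = render_desc(SAMPLE)
print("\n".join(text.splitlines()[:4]))
```

The generated text is itself valid Python, so it can be pasted into a module or evaluated to get a 256-entry tuple.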
I have the honour of speaking at the MySQL user conference, April 12-15 later this year. There, I will be doing a related talk called Optimizing MySQL Stored Routines. In this talk, I will explain how stored routines impact performance, and provide some tips on how you can avoid them, but also on how to improve your stored procedure code in case you really do need them.
A customer recently showed up with the following problem:
With your guidelines [1] I am now able to send the MySQL error log to the syslog
and in particular to an external log server.
But I cannot see which user connects to the database in the error log.
How can I achieve this?
During the night, while I slept, my brain worked on this problem independently, and by morning it had prepared a possible solution.
What came out is the following:
The UDF can be taken from [4]. Do not be confused by the version number; it worked fine with MySQL 5.1.42. Load the UDF into the MySQL database as described in the article. Follow the little example there and, if it works, let’s continue to the next step.
The SQL query to form the MySQL error log string looks as follows:
mysql> SELECT CONCAT('[Security] User ', USER(), ' logged in.');
And if executed with the function:
mysql> SELECT log_error(CONCAT('[Security] User ', USER(), ' logged in.'));
it produces the following output to the MySQL error log file:
shell> tail -n 1 error.log
100215 17:50:16 [Security] User oli@localhost logged in.
And now make this permanent for every user who does not have SUPER privileges:
#
# my.cnf
#
[mysqld]
init_connect = 'SELECT log_error(CONCAT("[Security] User ", USER(), " logged in."));'

Restart the database and it should work now (it could also work with just SET GLOBAL init_connect=…).
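On the receiving side, for example on the external log server, such entries are easy to pick apart again. The following is a minimal Python sketch and not part of our package: parse_login and PATTERN are hypothetical names, and the pattern assumes the line looks exactly like the sample output shown earlier (a two-digit-year timestamp followed by the [Security] message).

```python
import re
from datetime import datetime

# Sample line copied from the error log output shown earlier.
LINE = "100215 17:50:16 [Security] User oli@localhost logged in."

# Assumed layout: "YYMMDD HH:MM:SS [Security] User <user>@<host> logged in."
PATTERN = re.compile(
    r"^(?P<stamp>\d{6} +\d{1,2}:\d{2}:\d{2}) "
    r"\[Security\] User (?P<user>[^@\s]*)@(?P<host>\S*) logged in\.$"
)


def parse_login(line):
    """Return (timestamp, user, host) for a login entry, or None for other lines."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    when = datetime.strptime(match.group("stamp"), "%y%m%d %H:%M:%S")
    return when, match.group("user"), match.group("host")


print(parse_login(LINE))
```

Lines that do not match the pattern simply return None, so the function can be run over a whole error log without special-casing other messages.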
Please consider the MySQL documentation [3] and be aware of the following:
“Note that the content of init_connect is not executed for users that have the SUPER privilege.”
Further, I want to warn you that I have NOT tested the impact of this method on stability and performance! Please test it carefully yourself and let me know if you find something, or also if it works smoothly for you.
This is part of the MySQL Auditing Package we are currently working on and we hope to finish it soon. If you are interested in this work please let us know; our MySQL consultants are happy to help you implement your own MySQL auditing in your environment.
[1] MySQL reporting to syslog
[2] MySQL useful add-on collection using UDF
[3] MySQL documentation: init_connect
[4] UDF collection