PL/pgSQL ON DELETE UPDATE

Posted on April 20th, 2011 in g33k life by aCaB

Apparently I’m the only one on earth which would greatly benefit from the construct in the title and this essential feature never got into SQL standards.
All right you don’t get it yet… So let’s start with a school case: the infamous blog in 15 minutes which is the only reason people use rails these days.
So we have a blog.
The blog has got some authors (don’t laugh it’s naive on purpose):

CREATE TABLE authors (
  author character varying(64) NOT NULL PRIMARY KEY
);

A blog has got a number of posts. Each post has got an owner:

CREATE TABLE posts (
  post_id serial NOT NULL,
  post_date timestamp NOT NULL DEFAULT NOW(),
  title character varying (128) NOT NULL,
  body text NOT NULL,
  owner character varying(64) NOT NULL,
  FOREIGN KEY(owner) references authors(author) ON UPDATE CASCADE ON DELETE NO ACTION
);

We’ve got a FK pointing from posts to authors so each post has got an author.
Additionally if the author changes her name, SQL will just adjust the ownership automatically. Now that’s cool.
The not so cool bit is that we cannot delete an author, without deleting all his posts first.
Now that’s a cool feature which guarantees integrity, but, for my own blog I want to be able to delete authors. And when I do that I want the ownership to be reassingned to some other author. Even a generic “I_Used_To_Post_Here” author…
SQL won’t let you do that automatically. Sure you can have it set to NULL on deletion. Which is sort of close to what I want, but not exactly the same.

Of course you can always handle the problem programmatically by creating a small function like:

CREATE FUNCTION delete_author(dropthis authors.author%TYPE, replacewith authors.author%TYPE)
  RETURNS void
  LANGUAGE plpgsql VOLATILE
  AS $$
BEGIN
  UPDATE posts SET owner = replacewith WHERE owner = dropthis;
  DELETE FROM authors WHERE author = dropthis;
END
$$;

And it works good too:

INSERT INTO authors VALUES ('aCaB');
INSERT INTO authors VALUES ('AuthorIsNoLongerHere');
INSERT INTO posts (title, body, owner) VALUES ('sample post', 'Lorem ipsum', 'aCaB');
brama=> SELECT * FROM posts;
 post_id |         post_date          |    title    |    body     | owner
---------+----------------------------+-------------+-------------+-------
       1 | 2011-04-20 12:59:26.969234 | sample post | Lorem ipsum | aCaB
SELECT delete_author('aCaB', 'AuthorIsNoLongerHere');
 delete_author
---------------

(1 row)
SELECT * FROM posts;
 post_id |         post_date          |    title    |    body     |        owner
---------+----------------------------+-------------+-------------+----------------------
       1 | 2011-04-20 12:59:26.969234 | sample post | Lorem ipsum | AuthorIsNoLongerHere
(1 row)
SELECT * FROM authors;
        author
----------------------
 AuthorIsNoLongerHere
(1 row)

However this doesn’t addess the very issue: automatically change ownership of objects on author deletion.
Suppose my blog is still in development, and it’s such a cool project that another guy joins me in the development.
He adds the option to create pages, besides blog posts. And pages too have got an owner.
However this new guy doesn’t know he’s got to add a line to the delete_author function to fix up the pages table too…
In the end, the approach described above doesn’t scale very well.

After a few days of banging my head on the wall I eventually asked on #postgresql and some guy totally pointed me in the right direction: why don’t you build dependencies at runtime gathering info from information_schema?

So here’s a better delete_author function, which doesn’t need to keep a static list of tables to be updated and automatically figures what to update whenever an author is deleted:

CREATE FUNCTION delete_author(dropthis authors.author%TYPE, replacewith authors.author%TYPE)
    RETURNS void
    LANGUAGE plpgsql SECURITY DEFINER
    AS $$
DECLARE
r record;
BEGIN
        LOCK TABLE authors IN ACCESS EXCLUSIVE MODE;
        PERFORM 1 FROM authors WHERE author = dropthis;
        IF NOT FOUND THEN
                RAISE EXCEPTION 'Author "%" must exist', dropthis;
        END IF;
        PERFORM 1 FROM authors WHERE author = replacewith;
        IF NOT FOUND THEN
                RAISE EXCEPTION 'Author "%" must exist', replacewith;
        END IF;

        FOR r IN
                SELECT
                        tbl.constraint_schema,
                        tbl.constraint_name,
                        tbl.table_name,
                        keys.column_name,
                        refs.unique_constraint_schema,
                        refs.unique_constraint_name
                FROM information_schema.table_constraints AS tbl
                LEFT JOIN information_schema.referential_constraints AS refs
                        ON tbl.constraint_catalog = refs.constraint_catalog
                        AND tbl.constraint_schema = refs.constraint_schema
                        AND tbl.constraint_name = refs.constraint_name
                LEFT JOIN information_schema.key_column_usage AS keys
                        ON tbl.constraint_catalog = keys.constraint_catalog
                        AND tbl.constraint_schema = keys.constraint_schema
                        AND tbl.constraint_name = keys.constraint_name
                        AND tbl.table_catalog = keys.table_catalog
                        AND tbl.table_schema = keys.table_schema
                        AND tbl.table_name = keys.table_name

                WHERE tbl.constraint_type = 'FOREIGN KEY' AND
                        refs.unique_constraint_schema = 'public' AND
                        refs.unique_constraint_name = 'authors_pkey'
        LOOP
                EXECUTE
                        'UPDATE '||quote_ident(r.table_name)||'
                        SET '||quote_ident(r.column_name)||'='||quote_literal(replacewith)||'
                        WHERE '||quote_ident(r.column_name)||'='||quote_literal(dropthis);
        END LOOP;
        DELETE FROM authors WHERE author = dropthis;
END
$$;

Admittedly it isn’t all that pretty. But sure it gets the job done.

Prevent exim backscatter by checking maildir quotas at smtp time

Posted on March 30th, 2011 in g33k life by aCaB

If you’re familiar with the Exim way of dealing with mails, quotas and ACLs, you’ve probably got a good idea of what the problem is.
If not, then a good place to start is this very old post on mail.exim.user.
In short, in Exim, quota checks happen at delivery (i.e. transport) time. If a mailbox is overquota, then the incoming message can’t be delivered. It is then queued, retried later (according to the retry rules) and eventually delivered as soon as some room is made (e.g. via downloading and deleting older messages).
However, if a mailbox is permanently overquota, or if the incoming message is way too fat to fit, then, when the retry time is over, exim bounces it to the sender.
Now, while this is good in theory, it’s actually a naive move in the internet of today where the vast majority of mail is spam and the vast majority of spam senders is forged: Exim is bouncing back spam to innocent senders. In a word: backscatter.

A better way to handle quotas is to enforce them at rcpt time and 4xx-reject the incoming message right away. This way no queuing happens and no bouncing takes place (well not directly from YOUR server at least).
And here the problems begin…

The first issue is that you can’t possibly know what the size of the message is before actually receiving it; and once you’ve received it, it’s too late to 4xx.
This means that, in order to enforce quotas at smtp time, you’ll have to let your lusers go slightly overquota before acting. This is simply a matter of setting “no_quota_is_inclusive” in your transport.

The second issue is that Exim does ACL verification as an unprivileged user which isn’t powered enough to look into the mail storage.
If you deliver to mboxes however the problem is easy enough to work around: just set the mail storage directories o+x.
In the appendfile transport you can e.g. set:
create_directory
directory_mode = 0771
mode = 660

In the ACL you can then simply ${stat} the mailbox and compare its size against the allowed quota. E.g., in the rcpt ACL you put something like:

defer
    domains = +local_domains
    verify = recipient
    condition = ${extract{size}{${stat:/var/mail/virtual_domains/${extract{mbox}{$address_data}{$value}{}}}}{$value}{0}}
    condition = ${if >= {${extract{size}{${stat:/var/mail/virtual_domains/${extract{mbox}{$address_data}{$value}{}}}}{$value}{-1}}}{USERQUOTA}}
    message = Mailbox over quota

where USERQUOTA is your quota macro of choice and $address_data['mbox'] is the path to the virtual mbox.

However, if you are delivering to maildir spools, then the whole thing becomes tricky as there’s not a single file to ${stat} but a possibly complex imap-generated hierarchy of directories and subdirectories.
The good news is, the Maildir++ format comes to the rescue…
It’s in fact just a matter of parsing the maildirsize file and compare the calculated size against the quota; if it’s lareger than what we allow, then we can reject with 4xx.
Now that’s cool, however, while parsing the maildirsize file could be possible with some ${perl} code or other expansion trickery, being able to actually read the file itself as an unprivileged user is a major obstacle (just ${stat}’ting it is not enough anymore).
Instead of doing the whole thing in Exim, however, I decided to write a separate daemon to be run as the “mailspool” user which could be queried via a unix socket to get the current quota. E.g.:

defer
    domains = +local_domains
    verify = recipient
    condition = ${if > {${readsocket{/path/to/socket}{/var/mail/virtual_domains/${extract{mbox}{$address_data}{$value}{nothere}}\n}{10s}{}{0}}} {USERQUOTA}}
    message = Mailbox over quota

So here’s the code.
And that’s the sample usage:

$ ./exim-quotad --help
Usage: ./exim-quotad

Options:
  -s, --socket <path>              creates socket in <path> (mandatory)
  -m, --socketmode <mode>          set socket mode to <mode> (default: 666)
  -u, --user <user>                change to user <user> (default: mail)
  -p, --pidfile <path>             writes pid to the file in <path> (default: no pidfile)
  -f, --foreground                 do not fork into background (default: daemonize)
  -d, --debug                      enable debug logging (default: debug is off)
  -l, --log <path> or --log syslog set logfile to <path> (default: logging is disabled)
  -t, --timeout <msecs>            sets socket timeout to <msecs> (default: 10000)
  -c, --max-children <num>         limits the number of forked children to <num> (default: 20)

  -h, --help                       print this help

Please be aware that the whole thing is untetested and not guaranteed to work for you. Also note that there are security concerns when running programs as the mail user.

SMTP hall of shame!

Posted on December 2nd, 2010 in g33k life by aCaB

Yesterday I got some pcaps from a mate who was trying to figue out why he had troubles receiving emails from all the Alice MTA’s (“Alice” being the broadband product form the largest Italian ISP).
He sent me pcaps, but from a quick look there was nothing wrong with it. The typical session looked like:
L: 220 Banner
R: EHLO stmp202.alice.it
L: 250-FEATURES...
R: MAIL FROM / RCPT TO / DATA
L: 354 Yeah, go on...
R: [some ~1000 bytes of data]
[silence]
L: 421 Hey, I don't have the whole day... Now GTFO!
R: disconnects

So WTF happened to the rest of the mail data??
Further investigation revealed that the problem is MTU related (lowering it to ~1460 makes the data flow again).
What happens is that the remote host starts sending out larget pakets (with DF set) in order to perform PMTU discovery. However at the same time it also drops or disregards any ICMP (fragmentation needed) notification.

I just couldn’t believe that, but, a quick peek at my own logs, revealed similar patterns. Also a quick survey with other BOFHs confirmed that, since about a week, all servers from “mamma telecom” are timing out in the DATA phase!

Now this kills me: not only they employ and rely on techies which would fail an interview for Mc Donald’s, but apparently they didn’t even notice the sudden raise in SMTP timeouts (and the growth of the mail queue – which tells me something about their monitoring technology) too!

Anyway, here’s the most idiotic hosts who timed out this week:

  1. smtpX.alice.it (35%)
  2. vsmtpX.tin.it (27%)
  3. smtpXweb.tin.it (8%)
  4. authsmtpX.register.it (4%)

lrn2internet !!!

BTW if you experience the same issues and really want to fix something which is NOT your problem, then a simple:
iptables -t mangle -A OUTPUT -p tcp --sport 25 --tcp-flags SYN,ACK,RST SYN,ACK -j TCPMSS --set-mss 1420
lets you receive mail from stupid hosts.

Move windows to a larger disk

Posted on September 29th, 2010 in Misc by aCaB

SCENARIO:

  • windows lives on /dev/sdb1 which is the old small partition
  • we move it to /dev/sdc1 wich is the new big partition

STEPS:

  1. clone the partition table and boot sector:
    dd if=/dev/sdb of=/dev/sdc bs=512 count=63
    then rescan drive partitions (partprobe, hdparm, echo “1″ > /sys/…/rescan, whatever)

  2. (alternative step) start with a fresh partition table:
    e.g. fdisk, cfdisk etc. make sure the type is 7 and the partition is bootable
    then install-mbr /dev/sdc
  3. clone the windows partition:
    ntfsclone --overwrite /dev/sdc1 /dev/sdb1
  4. resize the partition:
    ntfsresize /dev/sdc1
  5. mark the partition as dirty:
    ntfsfix /dev/sdc1
  6. boot the new disk and watch the scandisk: it will probably tell you it’s scanning D:
    Anyway note down the dos drive letter. When the disk is checked the box should reboot; power it off instead. If you are not quick, then just let windows boot. But do not log in. Again power the box down instead.
  7. boot ERD commander and mount the windows partition in it, then open the registry editor.
    Navigate to HKEY_LOCAL_MACHINE\SYSTEM\MountedDevices. Look for a value named “\DosDevices\C:” and remove it. Then look for the value named “\DosDevices\D:” (or whatever letter you noted down before) and rename it to “\DosDevices\C:“. More info here
  8. reboot without ERD

Native win32 exes in unix

Posted on September 11th, 2010 in g33k life by aCaB

One night, about one month ago, Sherpya on ICQ tells me he wants to run autochk.exe under linux as he’s pissed off of having to boot windoze just for running scandisk on an NTFS usb pen. Of course fsck.ntfs is not an option, he adds.
Since I have no use for NTFS (my only dual boot machine is XP on FAT32) I initially suggested that it would have been more fun to reverse engineer autochk instead. But soon I realized that a lot of work involved in loading alien code and emulating the api layer was quite similar to the PE emulation functionalities for ClamAV that I’ve been planning for ages and which I never really got started with.

Working closesly with Sherpya turned out to be a lot harder than I expected. We moved from very different perspectives and evolved different views and goals. We argued a lot along the way, mostly on side aspects and minor details (who said indentation? yes, that too…). However the technical challenges and the different inclinations and backgrounds helped us to get along somehow nicely.
I developed most of the low level stuff: the loader and all the assembly routines.
He’s instead focused on the build system and the ntdll emulation layer, which is by far the vast majority of the code. Being a portability freak, he’s also spent a lot of time on getting everything to work on FreeBSD, OSX and (this one really kills me) Win32 too.

The result is a minimalistic wine-like application which simulates the behaviour of the win32 native loader and provides an emulation layer around ntdll and other library routines. The emulation is only a fraction of that performed by wine and some parts of it are really crude hacks. However it’s good enough to run several different native (as in Subsystem = Native) executables.

Git repository can be found here

HOW IT WORKS
The loader reads the target executable then it allocates and maps into memory each of its sections.
Then it parses the import table and, for each required runtime library it attempts to dlopen a corresponding shared lib. Since native executables run inside a rather simple environment, the only runtime library they can rely upon is ntdll.dll (which gets dlopen’d as ntdll.dll.so).
Next, for each imported function a symbol lookup is performed and the corresponding import table entry is patched with the address of a small wrapper function. If the symbol is found the wrapper logs the call and jumps to the actual target. If the symbol is missing, then the wrapper complain about that and calls abort().
Then we fill in the PEB and TEB structures and make sure the latter is available via fs:[].
Next we allocate the stack and map KUSER_SHARED_DATA at 0x7ffe0000 where some static crt code expects it to be located.
Finally, we jump to the EP and let the alien code do its thing.

HOW TO COMPILE
You need GNU make (sometimes called gmake) and yasm.
Simply run make from the top source directory.

HOW TO RUN
First of all you need a native executable. If you want to run autochk.exe pick the one from W7. Threading is not properly implemnted and, for some weird reason, all previous versions rely on it.
Then simply execute: ./nloader autochk.exe disk.img
You should get something like:


Mapped section:    .text - RVA: 1000 -> 67000 - Raw: 400 -> 66200
Mapped section:    .data - RVA: 67000 -> 6b000 - Raw: 66200 -> 68e00
Mapped section:    .rsrc - RVA: 6b000 -> a3000 - Raw: 68e00 -> a0e00
Mapped section:   .reloc - RVA: a3000 -> a6000 - Raw: a0e00 -> a3200
Loading ntdll.dll.so and resolving imports
- ntdll.dll!_wcsicmp
[...]
- ntdll.dll!DbgPrintEx
Protection for section .text    set to r-x
Protection for section .data    set to rw-
Protection for section .rsrc    set to r--
Protection for section .reloc   set to r--
..

Checking file system on disk.img
The type of the file system is NTFS.
Volume label is sheghey.

One of your disks needs to be checked for consistency. You
may cancel the disk check, but it is strongly recommended
that you continue.
Windows will now check the disk.                         

CHKDSK is verifying files (stage 1 of 3)...

File verification completed.

CHKDSK is verifying indexes (stage 2 of 3)...
Deleting index entry System Volume Information in index $I30 of file 5.

Index verification completed.

CHKDSK is verifying security descriptors (stage 3 of 3)...

Security descriptor verification completed.

Windows has checked the file system and found no problems.

   2094079 KB total disk space.
    854716 KB in 4754 files.
      1808 KB in 302 indexes.
         0 KB in bad sectors.
     18379 KB in use by the system.
     12528 KB occupied by the log file.
   1219176 KB available on disk.

      4096 bytes in each allocation unit.
    523519 total allocation units on disk.
    304794 allocation units available on disk.

If you want to trace all api calls, you can enable debug logging via the NLOG environment variable.
Note that this can result in a lot of spam especially if you set NLOG=all.

COMPATIBILTY
We’ve tested a few native executable and the results are quite variable. Some work perfectly, some abort soon due to unimplemented APIs, some totally misbehave, likely due limitations in the emu layer.

Next Page »