Recently my wife approached me and told me that Gmail was warning her that she was using 95% of her (free) quota. This was a bit surprising, but my wife does a lot of photography, and sends lots of photos through the mail to various people. So I began helping her try to find her large messages. It was then that I learned that Gmail provides no easy way to do this. You can't sort by size. You can search for specific attachments, for example .psd or .jpg, and that is what she ended up doing.
Surely, I thought, there must be an easier way. Perhaps Gmail had an API like Google's other products. A bit of searching turned up that the only API to Gmail is IMAP. I didn't know anything about IMAP, but I do know some Python. And sure enough, Python has a library for IMAP called imaplib. Glancing through imaplib, I got the impression it was a very low-level library, and I began to get a bit discouraged.
I continued doing some searching and I quickly found IMAPClient, a high-level and friendlier library for working with IMAP. This looked like it could work very well for me!
I started thinking about writing an application to find big messages in a Gmail account. The most obvious and natural way to flag large messages would be to slap a Gmail label on them. But could IMAPClient do this? It didn't look like it. It turns out that labels are part of a set of custom Gmail IMAP extensions, and IMAPClient didn't support them. Yet.
I contacted the author of IMAPClient, Menno Smits, and quickly learned he is a very friendly and encouraging guy. I decided to volunteer a patch, as this would give me a chance to learn something about IMAP. He readily agreed and I dove in.
The short version of the story is I learned a heck of a lot from reading the source to IMAPClient, and was able to contribute a patch for Gmail label support and even some tests!
So if you need a program to categorize your Gmail by message size, I hope weighmail, the tool that grew out of all this, will meet your needs. Please try it out and feel free to send me feedback and feature requests on the Bitbucket issue tracker.
I have used it maybe a half-dozen times on my Gmail account now. My Gmail account is only about 26% full and I have about 26,300 messages in my "All Mail" folder. Run times for weighmail have varied from 6 to 15 minutes when adding three label categories for size. I was (and am) kind of worried that Gmail may lock me out of my account for accessing it too heavily, but knock on wood, it hasn't yet. Perhaps they rate limit the responses and that is why the run times vary so much.
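For the curious, the basic idea of labeling messages by size can be sketched roughly like this. To be clear, this is not weighmail's actual source, just a minimal illustration. It assumes a recent IMAPClient running on Python 3 (where fetch results are keyed by bytes such as b"RFC822.SIZE"), and the thresholds, the label names "big", "huge", and "enormous", and the helper names pick_label, label_by_size, and main are all made up for the example.

```python
MB = 1024 * 1024

# Hypothetical size categories: (minimum size in bytes, Gmail label).
THRESHOLDS = [(1 * MB, "big"), (5 * MB, "huge"), (10 * MB, "enormous")]


def pick_label(size, thresholds=THRESHOLDS):
    """Return the label of the largest threshold that size reaches, or None."""
    for min_size, label in sorted(thresholds, reverse=True):
        if size >= min_size:
            return label
    return None


def label_by_size(client, thresholds=THRESHOLDS, folder="[Gmail]/All Mail"):
    """Group messages in folder by size category and apply Gmail labels.

    client is expected to behave like a logged-in imapclient.IMAPClient.
    Returns a dict mapping each label to the message ids it was applied to.
    """
    client.select_folder(folder)
    msg_ids = client.search(["ALL"])
    # RFC822.SIZE asks the server for each message's size without its body.
    sizes = client.fetch(msg_ids, ["RFC822.SIZE"])

    buckets = {}
    for msg_id, data in sizes.items():
        label = pick_label(data[b"RFC822.SIZE"], thresholds)
        if label is not None:
            buckets.setdefault(label, []).append(msg_id)

    # One labeling call per category rather than one per message.
    for label, ids in buckets.items():
        client.add_gmail_labels(ids, [label])
    return buckets


def main(user, password, host="imap.gmail.com"):
    """Example driver; needs the third-party imapclient package."""
    from imapclient import IMAPClient  # imported lazily on purpose

    client = IMAPClient(host, ssl=True)
    client.login(user, password)
    try:
        return label_by_size(client)
    finally:
        client.logout()
```

Fetching only RFC822.SIZE keeps the traffic small, since no message bodies are downloaded, and grouping the ids first means each label is applied with a single call. A real tool would probably also want to fetch in batches and handle errors, which this sketch skips.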
In any event, I hope you find it useful. A big thank you to Menno Smits for his IMAPClient library, his advice, and working with me on the patch. Hooray for open source!