Wednesday 31 March 2010

Short post about good internet stuff

I just discovered the Tiny Art Director:
http://tinyartdirector.blogspot.com/
A professional artist takes commissions from his small daughter and reports her response to the finished work. Best read chronologically, starting when she was 2 years old. My favourite of her comments is the crushing "You're doing art the wrong way".

I hated Mika at first, but "Relax" was quite good. And here is his new one, which is excellent. RedOne has done pretty well out of the whole Lady Gaga thing, and without having to go to the trouble of wearing odd clothes and leaping about a lot.


There's this new thing called mflow, which is a bit like twitter if twitter were nothing but music recommendations (which would obviously be an improvement). You can listen to whole versions of recommended songs, but only once; it's more about finding new stuff than a method of actually listening to your music. I quite like it but goodness knows whether it will last -- the internet is so tiring these days. I got an invite off of the popjustice website (where else). (Also I have invitations now if anyone wants one.) It reminded me of this wonderful and oddly sinister song. According to wikipedia it was the last ABBA song ever recorded, and Agnetha sang the lyrics with the lights off.

So I bought it. By my calculations this means I have given popjustice 20p. And I'm fine with that.

You can get 4OD on YouTube now. I'm trying to work out how to link a computer up to my parents' TV. I wish I were just a bit more geeky; I'm geeky enough to have most of the disadvantages without the advantage of being able to do Cool Stuff.

Also I would like to point out that the new Mr Kipling advert, in which a slightly disconsolate Mrs Kipling implies that his renowned ability with cakes is a way of coping with erectile dysfunction, is wrong. Just plain wrong.

Tuesday 30 March 2010

Organization

Paper is good stuff. Hurray for paper! On the other hand, it weighs quite a lot and takes up space. Also a big heap of paper is hard to manage in the long term, unless you are excellent at filing. Plus can I take all my files with me into a manuscript library? No, I cannot. But I can take this: [stock photo]

It's my little 500GB hard drive! (If you want one you can get a commemorative "Michael Jackson is dead" version for less money than the plain variety.) I am quite pleased with how I'm getting on with making this change, so I thought I'd write a blog post about it, partly as an exercise in self-satisfaction, but also just in case anyone reading wants to make a similar transition. Of course the stuff I work on imposes a few odd constraints on how I work.

1. Managing references.
A long time ago I got fed up of retyping the same bibliographical references, and made a file called "Megabibliography" which was just an amalgamation of all the bibliographies I had ever written, for reference purposes. They were in variable styles, of course, but it worked as a quickish way of getting references. I switched to Endnote in 2003 or 4 when the whole thing got out of hand, and I had a several-thousand-item bibliography I had to manage for work, and also needed to generate an annual bibliography in a very odd style. (Importing all the data gave me RSI.)
The things which Endnote does which I really need are: ability to tag documents with keywords (e.g. "Calendars", "ASE 2006", or "ASE check", etc); easily customisable export styles, allowing me to produce my own style sheets for obscure journals, and XML material for easy import into other things; and it quickly generates citations in a way which I can edit. The last is vital: most reference software is designed for scientists, and is based on the idea that your computer does all the work for you. But in the humanities a reference manager is essentially a time-saving device. It won't be able to cope with certain oddities, like nineteenth-century German monographs which are also issues of a journal, etc, and it's important that you can easily edit what comes out. Beware of Zotero and Mendeley. They may be OK if you're just starting out, and they are free, but they do things automatically for you which you don't necessarily want done, and are very inflexible. Also I am intensely suspicious of the cloud. It's just not true that I am rarely offline; sometimes I'm on a train, or in Corpus Christi College, where there is no wifi at all. Occasionally broadband connections go down. And I'm not a fan of their social-networking-style features. The other day Mendeley told me that the most popular author in the Humanities is Foucault, and suggested that I might like to read Foucault. As many as 15 Mendeley users have Foucault in their bibliographies, apparently. Stupid software. That sort of thing is a bit like the Ask the Audience option on Who Wants To Be A Millionaire, in that it's most useful for the questions about soap operas. I can quite happily believe I'll discover a fun YouTube clip or a good TV show through Web 2.0/Cloud-style stuff; not so sure I'm going to find help in my work on late Anglo-Saxon liturgy.
Be cautious also about the option to import references automatically into your library. The website of one of those applications says something like "Avoid typing errors by importing references from web databases" but what it means is "Propagate other people's errors by importing references from web databases". Library catalogues are packed with mistakes, because a lot of libraries mistakenly treat cataloguing as a data-entry task and pay casual staff not much money to do it. Import material as a time saver, but you'll need to edit it before moving on.

2. Recycling my "Photocopied Papers" folders.
All my folders of photocopied papers have now been recycled into rat bedding or taken away by the council, and I gave the lever arch files to my dad. The key to this was a small sheet-feeding scanner, and DVD box sets. The DVD box sets are vital because this is dull work in the same way that feeding CDs into iTunes is dull. I recommend the complete Seinfeld; it's light-hearted and you've probably seen them all before anyway. I bought a little scanner which does one sheet at a time quite quickly and feeds it through for you. Plus points: it's cheap; it's very portable; it came with excellent software. It was probably my favourite thing about 2009. (It wasn't a great year.) But if I were seriously rich I would have spent about 300 quid on this superfast non-portable sheet-feeding scanner which can do double-sided. Really it's the software which made the whole thing possible. Feeding paper into a scanner is fine because you don't need to think about it. Then, once I had done a whole article, I simply highlighted all the images from it in Presto PageManager, which came with the scanner, right-clicked to stack them, and then dragged them onto a PDF icon to save as a multi-page PDF. Whilst doing this I was beginning to feed the next article through (unless I had got too distracted by the antics of that Kramer). So if you have a scanner but no software then look into something which will painlessly turn images into pdfs for you.

3. Metamanuscripts.
I work on medieval manuscripts. I pay a lot of attention to form as well as content. I have not yet managed to find a way of taking notes on a computer which is better for my work than paper and pencil, because I need sometimes to make drawings of odd shapes of letters, or of decoration. I'm not an artist by a long long way but I can copy things OKish. I've tried using electronic pen input, with a graphics tablet, but it's not as controllable as a real pencil in terms of shading and such like; and some libraries let you take your own photographs, but not all do, and besides sometimes you need to copy something in order to look at it properly, or to add notes on things you can't expect to come out in a photograph, like words scratched into gold on initials. So I have a lot of manuscripts about manuscripts. I digitised them in the same way as the articles of step 2, but I haven't recycled them just in case, since they are unique; the originals sit in a couple of large boxes full of suspension files. I have added the PDFs to my MS-image files, and now if I'm sitting in a library looking at a manuscript and it reminds me of another manuscript I can immediately call up all my notes on it and any images I have just like that. This is an immensely useful tool for me. I continue to take notes with a pencil on paper, and then scan them in when I get home.
NB The graphics tablet might not have worked for me for manuscript notes, but it was great for the RSI caused by step 1. Hurray! Also these gloves, without which I cannot now type; I recommend them heartily if you have back-of-the-hand style RSI like me, rather than bad wrists.

4. Photocopying costs 10p a sheet at CUL, and 20p at the BL (where you are not allowed to copy more than a single page at a time, i.e. no double-page spreads).
I ended up in a situation where I would carefully (and expensively) photocopy an article, take it home, scan it, and then put the paper straight into the recycling. My current project is an attempt to cut out the middle phase of this, thus saving money, time, and the planet. I have acquired a very cheap, portable, USB-powered flat-bed scanner, and I am experimenting with scanning things from books. For example I have some multi-author books with one or two seriously useful articles in, and I am trying out digitising these so that I can take them with me to libraries. So far it's working well. So if possible in future I will borrow books or journals and copy the article I want straight on to my computer, and the portable scanner can live in my suitcase when I travel. I know that the UL photocopiers will now, theoretically, scan things for you and e-mail them to you for 8p (!) an image, but when I tried it, it charged me for a whole article and only sent me the first page. Meaning I had to go back to the library and find the journal again, etc. I might give this another go sometime though, for things I can't borrow.

5. Image and text
I didn't originally OCR my scans. This was because at work I used to have access to the full copy of Adobe Acrobat, which would OCR pdfs for you, but very very slowly, without much accuracy, and with a large increase in the size of the file, so I wrote it off as a concept. However, I have recently got hold of ABBYY Finereader 10 and it is very fine indeed. It even OCRs in Latin. If I run pdfs through it which I scanned in from photocopies of open books, so that they have two pages side by side on a landscape sheet of A4, it automatically splits these into two separate pages and rotates them before doing the OCR. It's also possible to scan straight into Finereader; it does a lot of image processing though, and I think it's quicker to give it pdfs which it doesn't feel the need to tidy up so much.

6. Managing PDFs.
The point of having a computer-read text is to be able to search it. Of course the OCR isn't perfect (though it's very good) and I'm not wasting time making it perfect, so searches have to be reasonably fuzzy. It's at this point that I'm still experimenting. I can set Windows' own indexing facility to include pdfs (using the pdf iFilter which comes with Windows 7 at least) and then I can look for, say, "Harley" in all pdfs in a folder, and it returns a list of matches. However, it doesn't give me context -- it would be useful to have the several words before and after, especially since I am probably most interested in the number that follows Harley. Or I can use the "search all in folder" option in Adobe Reader 9, which I think will be useful. Ideally I'd like to produce a concordance file, but concordance software seems on the whole to need .txt files. If I could find a good piece of software that allowed me to insert my own index tags in pdfs and then produce an index across many pdf files I would have a go at making manuscript indexes for myself. These would be very very useful. I have manually indexed important articles in the past, and it's tedious work, but wonderful to have later.
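The "several words before and after" idea is what concordance people call keyword-in-context, and it is simple enough to sketch in Python. This is only a rough illustration, assuming the OCR text has already been saved out of the pdf as plain text; the function name keyword_in_context and the sample sentence are made up for the example, and it matches whole words only (ignoring case and stray punctuation) rather than doing anything properly fuzzy.

```python
import re


def keyword_in_context(text, keyword, width=3):
    """Return each occurrence of keyword with `width` words on either side."""
    words = text.split()
    hits = []
    for i, word in enumerate(words):
        # Strip punctuation stuck to the word and ignore case, since OCR
        # output tends to be untidy.
        bare = re.sub(r"^\W+|\W+$", "", word)
        if bare.lower() == keyword.lower():
            start = max(0, i - width)
            hits.append(" ".join(words[start:i + width + 1]))
    return hits


# Invented sample text, not a real reference.
sample = "See Harley 1234, fol. 3 and compare Harley 5678 for the same layout."
hits = keyword_in_context(sample, "Harley", width=2)
for hit in hits:
    print(hit)
# See Harley 1234, fol.
# and compare Harley 5678 for
```

Running something like this over a folder of text files would give a crude concordance, with the shelfmark number sitting right next to each hit.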

7. Backing up.
I had a hard drive fail only the other day, but my complex backup system meant that I didn't lose any data. Get at least one large external hard drive and set things backing up when you go to bed. I use BounceBack to do system backups and Allway Sync to do basic synchronisations of data. (BounceBack came free on an external hard drive I bought; it's sometimes worth paying attention to the software that comes bundled with hardware.) Allway Sync lets you choose whether or not things deleted from the source folder get deleted from the backup folder, so I back my external hard drive up twice, once in each way to different hard drives. If I accidentally delete something I can retrieve it from the comprehensive backup, but if I were to lose the whole drive I would get my new copy from the more accurate backup. Of course if I were really paranoid I would have an off-site backup, but I have too much data for an online one.
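The difference between the two kinds of backup can be shown with a small Python sketch. This is not how Allway Sync works internally (I have no idea how it does it); it is just a minimal illustration of the two behaviours, with a made-up function called sync and throwaway temporary folders standing in for the drives.

```python
import shutil
import tempfile
from pathlib import Path


def sync(source, backup, propagate_deletions):
    """Copy every file from source into backup.

    If propagate_deletions is True, files that no longer exist in source
    are also removed from backup (empty folders are left behind).
    """
    source, backup = Path(source), Path(backup)
    for src in source.rglob("*"):
        if src.is_file():
            dest = backup / src.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
    if propagate_deletions:
        for dest in list(backup.rglob("*")):
            if dest.is_file() and not (source / dest.relative_to(backup)).exists():
                dest.unlink()


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp, "work")        # the drive I actually use
    mirror = Path(tmp, "mirror")    # the "accurate" backup
    archive = Path(tmp, "archive")  # the "comprehensive" backup
    for folder in (work, mirror, archive):
        folder.mkdir()
    (work / "notes.txt").write_text("draft")
    (work / "article.pdf").write_text("scan")

    sync(work, mirror, propagate_deletions=True)
    sync(work, archive, propagate_deletions=False)

    (work / "notes.txt").unlink()   # an accidental deletion

    sync(work, mirror, propagate_deletions=True)
    sync(work, archive, propagate_deletions=False)

    mirror_files = sorted(p.name for p in mirror.iterdir())
    archive_files = sorted(p.name for p in archive.iterdir())

print(mirror_files)   # ['article.pdf']
print(archive_files)  # ['article.pdf', 'notes.txt']
```

The mirror matches the drive as it is now, while the archive still holds the file that was deleted by mistake.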

8. What I'd like to try.
I think I'm going to try taking photos of books and see whether Finereader can OCR them for me. The problem will be camera shake. But ABBYY have a special application for this, Fotoreader, which is what gave me the idea it might be worth trying.

9. Maybe in about 10 years' time.
My graphics tablet has amazing handwriting recognition. I found I could write more or less normally and it would OCR what I had written. So the next thing will be something I can run my handwritten MS notes through which will OCR them and make them searchable. Not possible now I think, but technology might well head that way. So at present I can't search my manuscript notes, which would be a useful thing to be able to do. I have two workarounds for this: the first is OneNote. I switched to OneNote a couple of years ago and now I absolutely depend on it for all my work and home organisation and everything, and will never be able to leave it. It replaces those lever arch files I used to have which organised notes for a particular project. When I find something in a manuscript which I don't need at the moment but think I might want to find again one day I make a note in OneNote. For example the other day I saw an interesting form of quire signature which reminded me of something in a Trinity MS, and because I made a quick note in OneNote I can find it immediately by searching on "quire" and tell you now that it was in an 8th-century Merovingian manuscript in the Vatican. I'm not going to go into the excellences of OneNote now; but I find it very useful for many things, and it's now the core of how I work. My second workaround is old school: I have a little address book, the sort with alphabetical tabs but no actual text printed in it, and I copy interesting letterforms into it. One day I'll work out how to migrate it to my computer, but at the moment I just carry that around. It's useful to consult but easy to leave at home accidentally.

10. If I were a proper geek.
There are instructables and such online for making your own book scanner, from complex ones to more basic but clever designs. It would be a lot quicker, and a bit better for the book, than using a flat-bed scanner or photocopier, and quicker and more accurate than the basic camera method. But I don't have the oomph to make one, and even if I did it would be a bit expensive. You can get a proper professional set up for about 1500 dollars, plus the cost of a good camera, and I think more and more libraries are investing in them. Then they digitise books which are in demand for undergraduates. I have no idea what the copyright implications of that are. (For myself I'm not digitising anything I'm not allowed to photocopy, so I don't think it's an issue for me.) I saw the scanner at Stanford when I went out, one of the ones which works for google books, but it's huge, the size of a room, and we've all seen how patchy its results are.

Of course some parts of this have taken me quite a bit of time, but I am prone to intense but short-lived enthusiasms, and got most of it done while watching TV I was probably going to watch anyway. Now it seems easier to me to scan something in than to file it. And the weird thing is that my pdf library is only just over 9GB in size, i.e. tiny. (My manuscript notes library is 155GB but then it has a lot of images in.)

Wednesday 24 March 2010

Uncanny voices

This is my blog, and I'll fill it with embedded YouTube videos if I want to. In more news from the internet of the recent past, here is a recentish Hot Chip video. Now, I was feeling a bit, as it were, "over" Hot Chip, but I've decided I like them again on the strength of this. Well done Peter Serafinowicz! It does exactly what a video should do: make you listen to the song enough that you get to like it; and draw out some quality of the song that you would have liked anyway. (In this case, the uncanniness of the man's voice.) It's a bit like the way that good literary criticism leaves you with a greater appreciation of the work itself. I never got Antony and Cleopatra when we did it for A-level until I read an excellent essay saying it was all about gaps in communication, but ever since then I've actually been quite fond of it as a play. (I think that's why I don't like a lot of Theory-style criticism much, because it adds too much. It's like if someone said, Look at that tree! and then pointed out to you some of the beautiful things about the tree, you could look at the tree and enjoy those things which you might not have noticed immediately; whereas lots of modern criticism is more about painting the tree in bright colours and hanging exciting things off its branches, and yes the result looks great and very interesting, but I'm more interested in the tree itself, and frankly I think you're a bit weird.)


And in news from my ipod's shuffle facility, here is one of my favourite Handel arias. My version is sung by Sarah Connolly, who is truly brilliant, and takes it a bit more slowly, which I think suits it better; but there's nothing so uncanny as a counter-tenor's voice. (Myself I'm still croaking from my endless sore throat *sigh of self-pity*.)

Tuesday 23 March 2010

It continues

New annoyances: thousand-pound council tax bill for a place I no longer live, necessitating lots of repeat phone calls to people who are terribly nice but say everything's fixed when it turns out later that it isn't; I can't find my copy of Bishop's English Caroline Minuscule (once owned by Clemoes); I still have a horrible sore throat and the only pastilles which help are foul; and frankly being involved in the castration of sweet little alpacas is revolting. Darcy the alpaca is very annoyed with us, and is taking it out on Glenfiddich, who is a bit more phlegmatic about the whole thing.

Continuing my catch-up with the internet of a fortnight ago, here is an excellent Sophie Ellis Bextor song which was new that recently:


And this excellent drawing blog has a guide to making perfect sauerkraut. Don't forget to put the wine in its place.

Sunday 21 March 2010

Stuff

Well to add to the pile of life aggravations I now have a hard drive failure, a bad cold that's still going strong after three weeks, and a failure to get a job I really wanted. Also an impending operation on my lovely rat Audrey. And tomorrow I get to help a vet castrate alpacas, which I'm not sure is going to be fun. I'm so far behind with my google reader subscriptions that it has just said "1000+" for ages. But while making vague attempts to break the 1000 barrier I was quite pleased to see this explanation of how Mega Shark can take out an aeroplane. Also I hadn't seen the clip before, and it is great:

I suspect that Mega Shark vs Giant Octopus is one of those films that's better in clips.

Another good thing:

But otherwise this season of Skins has been rubbish. It used to be funny; the bits with Josie Long in were fantastic. Also it was quite clever. But the last two series have just been full of Issues.

And I really like this xkcd about the Collatz Conjecture, which I hadn't heard of before. But it disconcerts me that it's called a conjecture; at first glance it seems like the sort of thing they used to make one try to prove for the British Maths Olympiad. I would get out a pencil and try except that I've forgotten all that stuff now, and I was never one of those who were brilliant at it.
Edit: this Wikipedia entry on it shows that it really is a conjecture. It's always odd when things which are really easy to state turn out to be so hard to prove.
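For anyone who would rather not get out a pencil, the rule is short enough to try in Python: if the number is even, halve it; if it is odd, triple it and add one; the conjecture is that you always end up at 1. The sketch below just counts the steps for a given starting number (the function name collatz_steps is my own invention), which of course proves nothing at all.

```python
def collatz_steps(n):
    """Count how many steps the Collatz rule takes to get from n down to 1."""
    if n < 1:
        raise ValueError("n must be a positive whole number")
    steps = 0
    while n != 1:
        # Even numbers are halved; odd numbers are tripled and have one added.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))   # 8
print(collatz_steps(27))  # 111
```

Starting from 6 the sequence runs 6, 3, 10, 5, 16, 8, 4, 2, 1, which is the eight steps the function reports.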

Friday 12 March 2010

Smashing things up

Lots of life has been happening to me all at once recently -- not anything excessively dramatic, just lots of travel and deadlines and problems, all topped off with the worst cold I have had in ages and a large sprinkling of minor aggravations like the bank losing a cheque I paid in, etc. I have over 700 things to read on my google reader, and have three weeks' worth of Guardian review sections to catch up on, and I keep not being around for Glee on Monday evenings. Once I have shaken off the lurgy and got back into the swing of things maybe I will post something terribly reasoned and insightful on this blog. In the meantime here is another video by OK Go, the people who make excellent videos. (They came up with the treadmill dance routine thing which one sees ripped off all over the place now.)