CoralNet
Sunday, December 16, 2012
Image upload, and general updates
The majority of my time on CoralNet for the first half of the fall - as well as most of the summer - was spent on the image upload page.
Fixing a bug
It started when I decided to address a to-do item that had been on our list for a long time. It seemed that the image uploader did not check for corrupted images. And uploading corrupted images could lead to buggy behavior if you later tried to work with such images on CoralNet; for example, you might see a blank space on a page that was supposed to show a thumbnail of that image.
It had been quite a while since I touched the image upload code. When I took a look, I found a code comment marked "TODO" saying something to the effect of: On multi-image upload, only the first image is checked for corruption. Need to fix.
Why did I knowingly code it like that? I don't know, but when I started modifying the code to fix it, I realized that the code for processing a multi-image upload was really unwieldy and hard to wrap my head around. I eventually fixed the code to check for image corruption of every image, but I really didn't like the code as it was.
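(Incidentally, the corruption check itself is the simple part. As a rough illustration - this is a minimal sketch using the Python Imaging Library, not CoralNet's actual code - it comes down to something like this:)

```python
from PIL import Image

def image_is_corrupt(image_file):
    """Return True if PIL can't parse the given path or file-like object."""
    try:
        im = Image.open(image_file)
        im.verify()  # raises an exception on truncated or otherwise broken files
    except Exception:
        return True
    return False
```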
I figured that while I was already messing with the image upload code, I might as well tackle another related point that had been on the to-do list for a while: make the image upload form use Ajax to show upload progress and allow partially complete uploads.
What does that mean? Well, at the time, when the user selected multiple images to upload and clicked submit, the browser would start sending the upload request, and would just sit there sending until it finished. Even if the upload took minutes or hours, there was no indication about how far along the upload was. Moreover, partial uploads were not possible: if the user's internet connection dropped out in the middle of the upload, then all of the upload progress would be lost.
With Ajax image upload, on the other hand, upload requests could be made asynchronously and separately for each image. We send the first image, and then whenever that finishes, we get a status reply back, and then we send the second image. This gives us two things: the ability to have an upload progress indicator (at least on a "3 out of 10 images" granularity), and the ability to save uploaded images individually, which allows partial uploads and reduces my code's complexity.
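On the server side, the idea looks roughly like this. (This is a sketch under assumptions, not CoralNet's real view: the URL, the 'file' field name, and save_image_to_source are made-up names for illustration.)

```python
from django.http import JsonResponse

def ajax_upload_one_image(request, source_id):
    """Handle one image per Ajax request, so each image succeeds or fails on its own."""
    image_file = request.FILES['file']
    if image_is_corrupt(image_file):  # the check sketched earlier
        return JsonResponse({'status': 'error', 'message': 'Image file is corrupt'})
    image = save_image_to_source(image_file, source_id)  # hypothetical helper
    return JsonResponse({'status': 'ok', 'image_id': image.pk})
```

The browser-side JavaScript just sends these requests one at a time and updates the progress display after each response.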
Seemed simple enough, although just by rewriting the behavior to go from "process all images at once" to "process images one by one", I ended up sifting through pretty much all of the upload post-processing code: checking for non-corrupt images, setting image metadata fields, generating random points, and so on. And along the way, I wanted to improve this little thing too, and tweak that little thing...
To cut a long story short, I spent a long time tweaking the image upload process. Not all of those tweaks would make for interesting reading if I just rambled on about them, so why don't we just see how the page works now? I feel like I never got around to explaining all of the things it can do, even before I overhauled the page recently.
The image upload page
Click "Choose Files" (or whatever the button looks like in your browser) and choose some files to upload from your computer.
You'll see a summary of the files you've chosen. (Yes, I happen to have test images of 1 KB; typical images are more on the order of a few MBs, though.) Click "Start upload" to start the upload!
Here's the new stuff. You can see the upload in action: here, the first image has been uploaded, the second is currently uploading, and the third is waiting to be uploaded. A link to the first image is already available, and you can click that link to see the image in your source.
You can also click "Abort upload" to stop the upload if you wish.
And there's the upload when it's complete.
Now, there's one thing about the upload page that's been tripping up a lot of users: specifying metadata. Specifying metadata in the image filename is great for power users: if you already have all of the filenames organized in the exact right format on your computer, uploading is a breeze. However, this way of specifying metadata is not easy to use for the average user; it caters toward a programmer's way of thinking. Many users will understandably make mistakes when trying to upload their images.
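To give a concrete sense of what the filename parser has to do, here's a rough sketch. The format assumed below (underscore-separated location values followed by an ISO date) is just for illustration; the real format depends on how your source's location keys are set up.

```python
from datetime import datetime

def parse_metadata_from_filename(filename):
    """Illustrative parser for filenames like 'cool_001_2011-05-28.jpg'."""
    name = filename.rsplit('.', 1)[0]               # strip the extension
    *location_values, date_token = name.split('_')
    if not location_values:
        raise ValueError("No location values found in: %s" % filename)
    photo_date = datetime.strptime(date_token, '%Y-%m-%d').date()
    return {'location_values': location_values, 'photo_date': photo_date}
```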
As you can see, there are limits to how helpful one can be with a simple filename parser. Sometimes we just can't tell what kind of mistake the user made; maybe it was just a complete misunderstanding of the format.
One thing we are currently working on is re-thinking the way metadata is specified. However, for now, we only have the filename method. At least know that we'll tell you if you've made a mistake. This includes trying to upload an image that is considered a duplicate (has the same location values and year as an image already in the source):
The green link links to the existing image.
You can choose to upload the image anyway, even if you know it is a duplicate (your newly uploaded image will replace the duplicate image in the source). To do so, choose "Replace" for the option "Skip or replace duplicate images" near the top of the page.
How about those corrupt images mentioned earlier? Well, do know that these can't generally be detected until you actually upload the image. But once you try to upload the image, we'll detect corruption, all right.
Note how this doesn't affect the status of other images that you upload along with it.
Now, you might have noticed that part in the middle of the page reading "Points and annotations (optional)". That functionality used to be a separate page called Import Annotations, which, frankly, probably no one ever used or noticed. Now it's merged into the image upload page.
So what is it? Let's check the checkbox to see.
You can choose a .txt file that contains some point and annotation data. Once you do...
The .txt file is double-checked, and the table of images (at the bottom) is matched with the points and annotations in the file. When you then click "Start upload", it'll upload those points and annotations along with the image. Normally, we generate a number of random points for newly uploaded images, but if you specify the points yourself like this, then we won't generate random points. The main use of this is when you have some images that were annotated offline with a program like Coral Point Count.
Or maybe you just want to specify the points yourself, but not the annotations; you want to annotate those points using CoralNet. You can do that too: try the dropdown next to "Data" and select "Points only" to only import points:
So, what does the points/annotations text file look like? Something like this:
cool; 001; 2011; 200; 300; Acrop
cool; 001; 2011; 50; 250; Blast
cool; 001; 2011; 10; 10; Porit
cool; 001; 2012; 1; 1; CCA
cool; 001; 2012; 400; 400; Turf
Here, each line of the text file contains information for a point of an image:
- image's location value 1
- image's location value 2
- image's year
- point's row location in the image
- point's column location in the image
- annotation's label code (if uploading annotations)
Also note that one text file can specify the points/annotations for multiple images. In that case, when you upload the text file and the multiple images together, all of those images will get the specified points/annotations added for them. If you do it right, you could package all of your Coral Point Count work in a single upload!
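To make the format concrete, here's a rough sketch of how one of these lines could be parsed, assuming a source with two location keys as in the example above. (Illustrative only - not the actual import code.)

```python
def parse_annotation_line(line):
    """Parse a line like 'cool; 001; 2011; 200; 300; Acrop'."""
    tokens = [t.strip() for t in line.split(';')]
    if len(tokens) not in (5, 6):
        raise ValueError("Expected 5 or 6 fields, got %d: %r" % (len(tokens), line))
    value1, value2, year, row, column = tokens[:5]
    label_code = tokens[5] if len(tokens) == 6 else None  # absent for a points-only upload
    return {
        'image_key': (value1, value2, year),  # identifies which image the point belongs to
        'row': int(row),
        'column': int(column),
        'label_code': label_code,
    }
```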
What could possibly go wrong? Well:
Things like that. Again, this is a powerful tool, but it can be tricky to use. Still, we'll try to point you in the right direction to fix any mistakes you might have made.
Note that once you've fixed your mistake in the annotation file, you can just click "Re-process file" to see if you've got it right this time. No need to re-select the file using "Choose file".
If you've used Coral Point Count to annotate images, you might be thinking: "My annotation data is all in Coral Point Count files or Excel files. Why can't you just let me upload those?" Well, to be honest, that's one thing that's been on our to-do list for a while, so we'll do our best to find the time to do it. It would certainly make the point and annotation upload easier to use!
That's about it for the image upload and annotation import page. Hopefully you now have a better idea of what this page has to offer. Programmers like me often have a tendency to code without explaining what they've done; well, here's my attempt to fight that tendency.
Unit tests
I'll go on and finish explaining what I've done recently. The next thing I did was write a bunch of unit tests for the upload and import code.
These unit tests, well, test the upload and import functionality to make sure things work as expected. The unit tests can be run and re-run automatically. That way, if we change the code again down the road (and we will), we can re-run the unit tests to see if we accidentally broke any previous functionality. Overall, it means we'll be able to better detect bugs caused by future changes, and that means you'll see fewer bugs on the site.
Some examples of unit tests:
- Upload a single image, then make sure that image had random points generated for it.
- Upload a corrupt image and make sure the image corruption is detected and handled properly.
- Upload a duplicate image with the "Replace" option chosen, and check that the new image actually did replace the old one.
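For those curious what such a test looks like in Django, here's a rough sketch in the spirit of the corrupt-image test. The URL name and the JSON response format are hypothetical stand-ins; the real tests also set up a source and a logged-in user first.

```python
import io

from PIL import Image
from django.test import TestCase
from django.urls import reverse


def tiny_jpeg_bytes(truncate_to=None):
    """Build a small in-memory JPEG; optionally truncate it to fake corruption."""
    buf = io.BytesIO()
    Image.new('RGB', (10, 10), 'blue').save(buf, format='JPEG')
    data = buf.getvalue()
    return data if truncate_to is None else data[:truncate_to]


class UploadTest(TestCase):
    def test_corrupt_image_is_rejected(self):
        corrupt = io.BytesIO(tiny_jpeg_bytes(truncate_to=20))
        corrupt.name = 'corrupt.jpg'
        # 'image_upload' is a hypothetical URL name standing in for the real one.
        response = self.client.post(reverse('image_upload', args=[1]),
                                    {'file': corrupt})
        self.assertEqual(response.json()['status'], 'error')
```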
There are some things that the unit tests cannot check - for example, they can't check that a particular button is of a reasonable size to click, and they can't check that the page hasn't accidentally turned completely green. Testing the graphical interface is harder than testing functionality. Meanwhile, there are some things that I didn't write tests for simply because I didn't have the time to write tests that cover every possible situation. Still, though, a lot of things are tested. And if nothing else:
Seeing 36 out of 36 tests pass feels pretty good.
What's next
When I get time to work on CoralNet these days, I'm mainly working on the annotation tool, which is the page where you annotate images. We've gotten comments that the annotation tool's usability is at least comparable to Coral Point Count, so I think we have the right idea. However, there is no doubt that it can be improved.
The main thing I am working on is the general layout of the annotation tool. The general goals of a good layout are to make the tools easy to access for advanced users who want to work quickly, as well as easy to find the first time for novice users.
One key idea I am working on is sizing the annotation tool to fit completely inside the browser window. That way, you wouldn't have to use the browser window's scrollbars to look for something on the annotation tool. At the same time, though, sizing everything to the browser window can make some elements too tiny if we're not careful. The layout will have to make smart use of real estate, and should adapt to smaller or larger browser window sizes when possible.
A related idea is to let you pick one of a few different annotation tool layouts, whichever best fits your taste or monitor's aspect ratio. Should we provide a layout that's optimized for a huge wide-screen monitor? I don't see why not! At the same time, traditional 4-by-3 monitors also need a great layout that will make for easy and efficient use.
Well, that's all I have to say about my CoralNet work for now. In the meantime, our two newest team members, Jeff and Andrew, have been working on some interesting new features as well. We'll do our best to let all of our users know about the upcoming features as they happen.
Thursday, September 15, 2011
Screenshots: Annotation points, labelsets, and forms
Here's another batch of screenshots.
There's no annotation tool yet (i.e. tool that you can actually annotate images with), but we can overlay annotation points on an image:
Image upload form. It allows you to retrieve metadata (location keys and photo date) from your files' filenames.
You'll know if one of your files' filenames doesn't match the required filename format ("Filename error"). You'll also know if your Source already has an image with the exact same location keys and year ("Duplicate found"). When there are duplicate images, you can choose to skip uploading them, or upload and replace the old images.
Creating or editing a labelset. Each table row represents one label; click the row to select or unselect it, like a checkbox. If you're editing a labelset, then the changes from your labelset's current state will be shown in the rightmost column ("added" or "deleted"). And if you're editing a labelset, you can't delete labels that already have annotations in your Source. These labels are highlighted in gray (here, Pocillopora and Porites), and you can't change their status by clicking the row.
Creating a new label. This label form shows when you click "Add a new label that's not in the list" in the previous screenshot. Labels are shared site-wide. Eventually, we probably want some simple screening process for new label submissions.
Viewing a Source's labelset. There's one labelset per source, and one source per labelset. We originally thought we'd let multiple sources share a labelset, but that wouldn't work too well if you wanted to make any changes to the labelset.
The New Source form's look has been updated. This was the result of days of tinkering with CSS and trying to learn its... idiosyncrasies.
Bottom part of the form. The required parameters for Point Generation will depend on the Point generation type selected.
Sunday, August 14, 2011
Screenshots: Users, Projects, Images
It's all fairly minimal and rough around the edges, but here's what our website has so far.
Home page:
User page:
Project page:
Editing project details:
(Note that we're currently calling projects "Sources", but we're going to just call them "Projects" soon.)
Image upload page:
Browse a project's images:
(Later, this page will have many features related to image annotations, such as statistics on the images.)
Image details:
Friday, July 8, 2011
Week 2 (7/5 - 7/8)
This week we talked about our progress on the project at the weekly CVCE meeting. We got some feedback that we should focus on exporting data first and then on visualization. Professor Belongie had also suggested earlier that outputting all the data to Google Docs and to CSV would be a good way to go.
In terms of coding, the user system was implemented this week. Messaging is not complete yet, but that was more of a nice-to-have than a required feature. There was also a suggestion to base messaging around annotations/annotation efforts, and to have a forum-style discussion available for each image, annotation, annotation effort, and image source.
Thursday, June 30, 2011
The Internet Coral Database (CoralNet)
This is the development blog for the Internet Coral Database (CoralNet for short), a web application for the annotation and analysis of coral reefs.
Devang and I plan to use this blog to track our progress and our short-term and long-term development plans.
For now, though, most of our content is on the GitHub Wiki.