Google Summer of Code, point of view of a new admin/org

It has been a month since the announcement that AnkiDroid was selected for Google Summer of Code (GSoC). Here is the story of a new admin, in a first-time organization. As it’s standard to state, views are my own, not those of my employer or my organization. This post explains how we went from unprepared to a huge success even before the end of the application phase!

GSoC is a yearly event financed by Google. Organizations apply, describing their open source software and the languages and tools they use, and listing some potential projects. Google then selects organizations, lists them on its blog, and students can decide where to apply. The selected students can earn from $1500 to $3500 depending on the country they live in, paid by Google, and get a nice line on their resume.

We are competing for attention with some of the biggest Open Source organizations in the world, from programming languages and tools (Django, Python, Elm Tooling, GCC, Godot, Apache), OSes (Debian, FreeBSD, Gentoo), command line tools (FFmpeg, Git, GNU Mailman), user-facing software (VLC, Chromium) and websites (Internet Archive)... I thought that most students would apply to a project they already know; one must be so proud to state that they contributed to Python or VLC that we would never be able to compete with them in attracting students. I mean, you're pretty sure in your career to meet people who code in Python and like it, and people who use VLC daily to watch movies; everybody will be thankful to you for being a part of such wonderful tools! Compared to them, I saw no reason why AnkiDroid, with 2 million active users and a big community of language learners and medical school students, would attract any developers. There are a bunch of developers using AnkiDroid of course, and I know plenty of them... but the ones I know are almost all contributors to AnkiDroid, so that leads to a huge bias!

So, let's say I was surprised when we got 180 emails from interested students. This led the AnkiDroid team to have secrets, for the first time ever. Up to March, virtually every single discussion was public. However, we felt like we could not ask students for their feedback on the rules we would use to select them. Many would be biased to choose rules that would select them. Furthermore, it's not "the whole AnkiDroid community" who would be mentors, but only four of us, so in a way, I find it acceptable that the people who would do the actual work choose the rules to select who we would mentor.

For the first time ever, I really felt like I deserved my role as maintainer. Previously, I only worked on Anki(Droid) code. Compare this to the first of the current maintainers, Mike, who did all the tasks to automate the deployment process and improve automated testing, and who also started the Open Collective. I have the power to manage the Open Collective if necessary, and to decide what to do with the $16k we currently have, but it's a power that I never wanted and that I'll try to use only if nobody else can do it. I used to only do code, and suddenly, I started writing a lot of documents: for Google's application, to students, and to people who gave us money too.

I had to write a letter to students explaining what we would take into account when choosing them, how they should apply, and what we expect. We didn't prepare this in advance since we thought we'd get a few candidates we already knew. And we quickly realized that to choose between 180 people, or even 16 people, we needed more formal criteria. I went to find Nicolas, the creator of AnkiDroid, who has dealt for years with GSoC in MediaWiki, to give us advice and feedback; that was extremely helpful. I'm kind of sad that students cannot thank him directly, but I guess that dealing with MediaWiki (the tool behind Wikipedia) means that AnkiDroid is actually a small project for him!

Suddenly, I understood why applications for internships with MediaWiki, Inkscape or Git are so hard, and why they require one or two pull requests (code contributions) from each candidate. I find it distasteful to require free work from students in order for them to compete for money. Discovering a code base takes, at best, hours of work. And GSoC attracts so many people that most students will not be selected. I'm a-okay with it because AnkiDroid has an Open Collective, with $16k in our account, so we could pay a little bit of money to the students if they want; but still we only pay $10/hour and at most $200/month. It's not a job, but a tip from our users to free software[1]. To be fair, the same rules apply to maintainers; we are not asking anything from the students that we don't require of ourselves first. Clearly, we had to have these kinds of drastic requirements, asking people to act to prove their interest.

On the other hand, we were ready to help people onboard. We took time - a lot of time - answering questions, annotating issues as "good first issues" so that they can have something to work on[2], and reviewing. I lost time reviewing. There were many errors that students made that I could have corrected myself quickly. Correcting code interpreted by a computer is so much easier than explaining something to a human. However, if I did the correction, the student/contributor would not have learned, and that would be a clear loss in the medium term. We spent an incredible amount of time teaching about atomic commits, commit messages, and rebasing interactively to correct typos and errors instead of adding a commit on top of another. We tried to explain why we want tests to pass on each intermediate commit, and that small commits help the reviewers review more easily. In March, we got 38 contributors[3], while we only got 26 drafts of GSoC applications. I suspect that some people realized that it's hard to contribute correctly, harder in some sense than doing an academic exercise, and decided not to go through the whole application process.

I want to mention some PRs that were important to me. One task in AnkiDroid was slow. It was rarely done but could take dozens of seconds when the user ran it. I did try to optimize it as much as possible but could not get something correct without fundamentally rewriting the database layer. One contributor made a one-line change, a WHERE clause in a query, and that saved quite some time; somehow I had totally missed it. Another contributor wanted to uncouple multiple elements in a big class in order to add features. This led to splitting a PR into two preliminary parts, and one of those preliminary parts was also split into another simpler PR!

So many contributors simultaneously meant that, for the first time ever, we really had to require people to ask us to be assigned tasks. I think we had one or two conflicts where people corrected the same bug. That almost never occurred before.

There are three people we rejected before they submitted their application. In all cases, these were people with whom we were not able to have a discussion. However, the three cases were very different. One person understood neither how to copy and fill in the Google Doc template nor how to join the Discord server the team uses to discuss. Another asked us multiple times to review an application which did not follow the template and where a lot of information was missing. The last one was someone very talkative. It was one of the longest applications I read, and I could not make sense of it. For example, they explained that the PR they submitted saved a lot of time, and when we answered that it does not compile and that, even if it did, we wanted actual measured numbers, we got an answer that they are also a developer and that it's clear it saves time... Generally, most of the conversation seemed empty of actual technical meaning, and I totally failed to explain what we required in the team.

We required candidates to also have written one test. Except that we didn't clearly explain what our rules about tests were and how to find missing tests. So I wrote a test document on our wiki.

I thought that our codebase was not really excellent, even if I have tried to improve it as much as possible since I joined. I thought it would be hard and that only people motivated by their love for AnkiDroid would take the time to understand and contribute. At best, I was expecting some simple little issues and tests. I was really astonished to find people who had just discovered Anki making real, non-trivial, quality changes that would have taken me more than an hour - and probably took them far more since they didn't know the codebase. I'm happy that "have used AnkiDroid previously" was a preference and not a requirement, as I would not have wanted to reject such good contributions arbitrarily, even if I'm not sure what motivates them to participate.

Multiple people wanted to improve our UI. It's really old-looking; we are not a beautiful app. This is a strong complaint from new users, and it's quite probable that we lost a lot of them because of that. Medical school students are so desperate that if someone they trust tells them we are a really great tool, they'll try it, but more casual learners may not care enough to try something that looks old. However, we rejected most of the UI change proposals. I also wrote a wiki page explaining why, so that we don't have to repeat ourselves over and over and have a source of truth that all maintainers agree on. Essentially, it's really easy to have a strong opinion about the size of a text or a color, and really long discussions can ensue. We DON'T want to deal with it and will not deal with it unless the person can convince us with real arguments. Improving accessibility is great. Adding a missing feature too. But changing something just to make it more beautiful is not acceptable currently. We have very vocal hard-core users; they want to keep the app free of distractions and very basic to use, so they can concentrate on learning.

There are also people who arrive with proposals that make no sense to us. In the two cases I have in mind, people wanted to introduce new features because it's cool, because every good app has it, and in both cases they failed to answer how it would actually benefit our users. As an example, biometric identification is of absolutely NO use since we save everything in a "media" folder, plus a database on the user's phone. Using encryption here would require rewriting a big part of the backend. We don't ever want to have to deal with security at the level of AnkiDroid; if ever the user needs privacy, they should do it at the phone level. We know that we are not competent to do it right and we won't give false safety to our users.

There were only two PRs where I had to ask the contributor for changes due to efficiency concerns. One was doing O(n^2) work where O(n) could be done. It was easily corrected by using a more efficient SQL query, and honestly, I'm happy the contributor understood what I answered, because I don't know how I could have explained the problem if they didn't already know how to consider this kind of question. In the other PR, the contributor was committing data to the database immediately instead of buffering it, that is, keeping the pending changes in RAM and writing them out when the rest of the system decides it is time to save.
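To make these two problems concrete, here is a minimal sketch in Python with sqlite3. It is not the code from either PR, and it is not AnkiDroid's real schema: the decks and cards tables, the due column, and every function and class name below are invented for illustration. The first pair of functions contrasts rescanning all cards for every deck with letting the database do the grouping in one query; the small class shows buffering updates in RAM and writing them in a single transaction.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE decks (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE cards (id INTEGER PRIMARY KEY, deck_id INTEGER, due INTEGER)"
    )
    conn.executemany("INSERT INTO decks (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO cards (id, deck_id, due) VALUES (?, ?, 0)",
        [(10, 1), (11, 1), (12, 2)],
    )
    conn.commit()
    return conn


def count_cards_per_deck_slow(conn):
    """For every deck, rescan the full list of cards: decks x cards steps."""
    decks = [row[0] for row in conn.execute("SELECT id FROM decks")]
    cards = conn.execute("SELECT deck_id FROM cards").fetchall()
    return {d: sum(1 for (deck_id,) in cards if deck_id == d) for d in decks}


def count_cards_per_deck_fast(conn):
    """Ask the database to group the cards in a single query."""
    query = """
        SELECT decks.id, COUNT(cards.id)
        FROM decks LEFT JOIN cards ON cards.deck_id = decks.id
        GROUP BY decks.id
    """
    return dict(conn.execute(query).fetchall())


class BufferedDueWriter:
    """Hypothetical helper: keep pending updates in RAM, write them together."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []  # list of (due, card_id) not yet written

    def set_due(self, card_id, due):
        # Nothing touches the database here; the change only lives in memory.
        self.pending.append((due, card_id))

    def flush(self):
        # One transaction for all pending changes instead of one commit each.
        with self.conn:
            self.conn.executemany(
                "UPDATE cards SET due = ? WHERE id = ?", self.pending
            )
        self.pending.clear()
```

Both counting functions return the same dictionary; the difference only shows up as the tables grow. With the writer, the caller decides when flush() runs, which is the "saving time" mentioned above.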

Luckily, I had planned to take 11 days of holiday before Easter. This gave me plenty of time to do all of this work. The trouble is that I realized I was getting frustrated pretty quickly. I know perfectly well that I can NOT ask every candidate to read and remember every single thing we wrote: the entire wiki, the letter to candidates, the hints given in the template. Worse, when I gave feedback to a candidate, for the sake of fairness, I published it - anonymized and generalized - to all candidates in a #feedback channel. This channel is 4813 words long now, so of course people can't remember every random thing I wrote. However, it still felt frustrating when I had to repeat something a third time, to a third person, realizing that all of the previous work does not mean everyone knows what we asked them to do in detail. It is my personal rule that I always start PR reviews by thanking the person and explaining why the work is great[4]. It's far harder to start a review of a student application with positive feedback, and I'm not yet sure why. I assume that reading applications is just not something I find as intrinsically rewarding as reading code; I don't expect to be learning new things - in the sense that there is never a moment where I think "this is wise, I would not have thought about it, I hope I'll reuse this technique to write better".

I really feel outside of my comfort zone. I love to code. I never intended to have to deal with 50+ more people in a month. It's not just that I encountered that many people, even if I rarely meet so many people at once. It's that those people depend on me, on my feedback; they know that I'll make a decision that can have a huge impact on their summer, and thus their resume. I had to play the role of someone who knows what he is doing, not only in terms of code, but in terms of higher-level decisions. This is entirely new and unexpected. I believe I'm doing it correctly, but I also want time for myself, and I don't want the newcomers to be blocked for multiple days either. The other two maintainers are also overwhelmed, so adding my load to theirs would not be nice.

To be honest, one unexpected thing is that some contributors who joined 3 weeks ago already started to answer questions and give advice to people who arrived a week ago. That's really beautiful to see. I do not always agree, and sometimes I catch a mistake, usually because they don't have the higher-level view, but that's still really really cool, and I look forward to giving them reviewer rights in a few months if they are still here.


[1] I decided not to take a single cent. The amount I could make is so low compared to my job income that it's not worth the time I would spend declaring it to the tax administration and getting my employer's approval.

[2] We already had some good first issues. Not enough for 30+ people!

[3] We had 37 contributors in the whole of 2020!

[4] Except for current core contributors. I don't feel I need to ensure that core contributors still feel welcomed. After all, they have the keys.