At work, we use Perforce for source control. Perforce is the most popular source control system that I know of. There are some reasons for this: it is fast, it has a lot of features, and it has good support.
Before Perforce we used Visual SourceSafe. VSS has many failings. The database sometimes gets corrupted. It’s slow. It lacks features. Those failings are why we went shopping for a new source control system. But one good thing about it is that once you get it working and you back up your database, you can mostly focus on your work. Artists can use it and programmers can use it and you don’t need someone full-time supporting it.
Perforce fixes two of the problems with VSS: it’s fast and it has a lot of features. But it brings its own set of problems. Where before we didn’t have anyone supporting source control, we now have a technical director supporting Perforce nearly full-time. Support drags other people, including myself, into issues. For example, I had to sit through 15 minutes of a meeting yesterday listening to a proposal for a script to update artists from Perforce correctly. I’ve been here two hours today and have already overheard two conversations about recovering from Perforce-related problems. Because of usability issues I’ve stopped checking in more than once a month. It’s an issue often enough that when people say “Perforce fucked up” they always add an “again” to the end of that sentence. Now in all fairness, 90% of the time the problem is human error. But that’s part of my point – that’s 9X more problems than should be occurring. Why are there so many usability problems and why are people making so many mistakes when using Perforce? Because Perforce is fundamentally flawed in how it updates its repository finite state machine.
According to Wikipedia, a finite state machine is a model of behavior composed of states, transitions and actions. A state stores information about the past, i.e. it reflects the input changes from the system start to the present moment. A transition indicates a state change and is described by a condition that would need to be fulfilled to enable the transition. An action is a description of an activity that is to be performed at a given moment.
How do you determine the conditions that must be fulfilled to enable the transition? There are two ways to do this: event-driven and polling. With event-based updates, you find every situation where the event happens and trigger the condition. With polling, the state itself (or possibly the condition) checks all other relevant states to see if it should update. If so, it triggers the condition to change the state.
This is much simpler than it sounds. Let’s take a hypothetical game where the player has a health value and a death animation. There is a rule that if the player’s health reaches 0, the player plays a death animation.
With event based systems, you would find every place in the code that modifies the player’s health (barrel falling on you, bullet hitting you, falling off a cliff). If, after that event, your health is 0, you play the death animation. This system is efficient but has an important problem. If a new scenario is introduced (spike trap), and the programmer forgets to trigger the death animation, your system now has inconsistent states. In this case you’d have a bug in your game where the player is able to run about, perhaps invulnerable, after falling on a spike trap. This is an especially common problem in network programming where new remotely triggered events are introduced for most actions in the game.
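As a rough illustration of that failure mode, here is a minimal Python sketch of the event-based approach. The names (`Player`, `on_bullet_hit`, `on_spike_trap`) are made up for this example and do not come from any real engine. Each event handler is responsible for checking the death condition itself, and the spike trap handler forgets to.

```python
class Player:
    """Hypothetical player with a health value and a death animation."""

    def __init__(self, health=100):
        self.health = health
        self.state = "alive"

    def play_death_animation(self):
        self.state = "dead"


def on_bullet_hit(player, damage):
    """Event handler that remembers to check the death condition."""
    player.health = max(0, player.health - damage)
    if player.health == 0:
        player.play_death_animation()


def on_spike_trap(player):
    """New event added later; the programmer forgot the death check."""
    player.health = 0


shot = Player()
on_bullet_hit(shot, 100)

spiked = Player()
on_spike_trap(spiked)

# The spiked player has 0 health but is still in the "alive" state.
print(shot.state, shot.health)
print(spiked.state, spiked.health)
```

Nothing in the code ties the health value to the death state, so every new source of damage is another chance to leave the two out of sync.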
With polling systems, you have the destination state check its own transition triggers. So in the scenario above, once per cycle you would have the character check his own health. If, for any reason, his health becomes 0, then he changes to the death animation. This is nice because now you can add as many health modifications as you want and never have to worry about the player running around invulnerable. But it too has an important problem: it’s inefficient. Even when your player is inside an insane asylum, with padded walls, no bullets, no barrels, and no cliffs, you are still checking to see if his health reaches 0.
If you think about it, most computer and many non-computer systems can be envisioned as states, with corresponding event-based or polling checks. Menu systems often use event-based transitions (OnMouseClick) while AI often uses polling (Is there an enemy in front of me?). In baseball, the umpire calling “Strike!” is an event-based transition, while reviewing an instant replay is polling: the slow-motion replay breaks the ball’s flight into discrete frames, and each frame is checked for the transition condition.
The Perforce update system is also a finite state machine, and it is event-based. Being event-based, Perforce is fast but error-prone. It defines a specific set of conditions to update the database. The primary condition is that you check out a file, update it, and check it in. If you deviate from this model then you get inconsistent states. For example, if you change a file locally and later update that file from source control, you lose your work. If you add a file locally but forget to check it in, nobody can build the project. If you delete a file locally and later update, the file is back again. There’s no way to say “Just use what is on my hard drive!” Perforce is complicated and there are many ways to get inconsistent states. If you know exactly what you are doing and never forget what you’ve done between updates you are OK. If that isn’t true (which is to say you are a human and make human errors) then you are in trouble, because you are going to have problems ranging from lost time to lost work.
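A minimal Python sketch of the polling version might look like the following. The names (`PolledPlayer`, `update`, `spike_trap`) are hypothetical and chosen only for this example. The player checks its own health once per cycle, so code that changes health never needs to know about the death animation.

```python
class PolledPlayer:
    """Hypothetical player that polls its own health every cycle."""

    def __init__(self, health=100):
        self.health = health
        self.state = "alive"

    def update(self):
        """Called once per game cycle; the state checks its own trigger."""
        if self.state == "alive" and self.health <= 0:
            self.state = "dead"


def spike_trap(player):
    """Damage source that knows nothing about death handling."""
    player.health = 0


hero = PolledPlayer()
spike_trap(hero)
state_before_tick = hero.state  # not yet updated: the poll has not run
hero.update()
state_after_tick = hero.state   # the poll noticed the health change

print(state_before_tick, state_after_tick)
```

The cost is that `update` runs every cycle whether or not anything changed, which is the inefficiency described above.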
It’s easy to say “Just don’t deviate from that model, stupid!” and in fact some people do say that to the artists. But it’s something that shouldn’t need saying because the flaw is not in the artists but in the system.
The logical system for updates is:
1. You have completed work on your hard drive.
2. You upload the state of your hard drive to source control.
3. You do work.
4. Go to 1.
You don’t want to and shouldn’t have to think about source control while you are working – you only want to think about it while you are updating. The fact that you do have to is because Perforce is event-based, and event-based in such a way that if you don’t follow a certain methodology while working, Perforce won’t update correctly. Is this the fault of the person working, or a design flaw in the source control system? I believe that it’s the latter. As I work at a game company there are many artists. Artists care about conceptual models – they don’t care about leaky abstractions such as what version control system you are using. They just want to do their art and want the game to work. As the company gets bigger and there are more artists you get more errors. Eventually you reach the point where every day I hear coworkers complaining about problems related to Perforce and someone has to work nearly full time just supporting a tool. In my opinion this shouldn’t be necessary, and it is why I think that Perforce sucks.
For my game network library RakNet I use Sourcegear Vault. While not perfect, it supports both ways of updating (event-based and polling). With polling, it takes a second longer to scan my hard drive but:
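To make the “upload the state of your hard drive” idea concrete, here is a small Python sketch of a polling-style scan. It is only an illustration of the general technique, not how Perforce or any other product actually works, and the function names `snapshot` and `diff_snapshots` are invented for this example. It records a hash for every file under a folder, then compares two such records to find what was added, modified, or deleted, no matter how those changes were made.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to a hash of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old, new):
    """Return (added, modified, deleted) file lists between two snapshots."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return added, modified, deleted
```

A tool built this way would take one snapshot at the last check-in and another when you ask to upload, so the user never has to announce a change before making it. The price is reading every file on each scan.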
- I’ve never missed checking in a file
- I’ve never overwritten work I did locally
- I can do work on another computer, copy it over, and still upload and update from the repository
- I’ve never had to spend a lot of time recovering from a problem caused by source control
None of these things are true for Perforce and all of these things are true for Sourcegear Vault. As you get more users, Perforce gets worse: if each user has a probability p of making a mistake on a given day, then the odds that the whole team goes a day without mistakes are (1-p)^n, where n is the number of users. At p=1% and n=100 you only have about a 37% chance of going a day without someone losing work or breaking the build directly because of Perforce.
Which is why Perforce is not suitable for large projects.
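If you want to play with the (1-p)^n estimate yourself, a few lines of Python will do it. The function name `chance_of_clean_day` is just something I made up for this sketch, and it assumes every user makes mistakes independently with the same daily probability p.

```python
def chance_of_clean_day(p, n):
    """Probability that none of n users makes a mistake today,
    assuming each errs independently with probability p."""
    return (1 - p) ** n


# Same per-user error rate, growing team size.
for users in (10, 50, 100):
    print(users, round(chance_of_clean_day(0.01, users), 3))
```

Holding p fixed and raising n shows how quickly the odds of a trouble-free day fall as the team grows.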