Pipeline ramblings and questions

JHN · March 28, 2017, 9:22pm

Hi all,

Since I have given up on the concept of building a “killer pipeline”, I’m now basing the concept for our pipeline on a “scene centered” approach.
We’ve used max exclusively so the choice was simple, maxscript and dotnet. A bunch of tools and some conventions.
Some time ago I delved into MongoDB for our database needs, and I very much like it.

Now I’m gearing up/researching a more centralized, database-driven approach, which will include a bunch of metadata on our scenes.
But I’m not looking to build the one management app (say, Alienbrain-like) to rule the pipeline, but rather to develop a bunch of somewhat related tools when I need them, all inheriting from the scene-centered philosophy.

Now we still use max predominantly but other apps are making their appearance too, like modo for example and other tools that support python too.

So I’m tinkering with the idea of building Python-based components as the backbone and then mostly web-based tools for managing the pipeline. Using Python is outside my comfort zone for now, so I would be investing a lot of time to learn it properly, but it feels that with the way our industry apps are going, Python is more future-proof than anything else. (Or is it?)
Is there a web framework or Python server that supports MongoDB and that I could use as a base to develop web-based management tools, for things like setting up folders for projects (as a simple example)?
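
As a rough illustration of the "setting up folders for projects" example, here is a minimal sketch using only the Python standard library. The subfolder names and the `create_project` function are hypothetical, not an existing convention or API. The function returns a plain dictionary, which is the kind of record a document database such as MongoDB stores.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical studio convention; adjust to your own layout.
PROJECT_SUBFOLDERS = ("scenes", "assets", "renders", "feedback")


def create_project(root, name, subfolders=PROJECT_SUBFOLDERS):
    """Create the folder tree for a project and return a metadata record."""
    project_dir = Path(root) / name
    for sub in subfolders:
        # exist_ok makes the call safe to repeat on an existing project.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    # A plain dict like this maps directly onto a database document.
    return {
        "name": name,
        "path": str(project_dir),
        "subfolders": list(subfolders),
        "created": datetime.now(timezone.utc).isoformat(),
    }
```

A web view in whichever framework is chosen could call this function and then store the returned record, for example with PyMongo's `insert_one` on a collection.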

Any pointers or thoughts on the above are welcome, I realize it’s a very broad summary of how I’m thinking about developing, but python still scares me a bit as I’m not as productive with it as I am with our current tools.

Hope nobody was bored reading this!

Thanks,
-Johan

btribble · March 28, 2017, 9:46pm

The first big thing to consider is whether you are going to be implementing a dependency-based build system, or one that “just builds everything”. Have you figured out which approach you’re going to take? In a dependency-based system, a rig might be built because a model references it, because a level/world references the model, and because the world is referenced by some world editor DB file (or whatever). In the other approach, the build system just builds everything it sees whether it is referenced by something or not. There are advantages and disadvantages to both…
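
To make the dependency-based idea concrete, here is a minimal sketch that walks a reference graph from a root and lists only what is reachable, dependencies first. The asset names and the `REFERENCES` table are made up for illustration; a real system would gather references from the files themselves.

```python
def assets_to_build(roots, references):
    """Return every asset reachable from the given roots, dependencies first."""
    ordered, seen = [], set()

    def visit(asset):
        if asset in seen:
            return  # already handled; this also guards against cycles
        seen.add(asset)
        for dep in references.get(asset, ()):
            visit(dep)
        ordered.append(asset)  # appended only after its dependencies

    for root in roots:
        visit(root)
    return ordered


# Hypothetical reference graph: asset -> assets it references.
REFERENCES = {
    "world_editor.db": ["level_01"],
    "level_01": ["hero_model"],
    "hero_model": ["hero_rig"],
    "unused_prop": ["prop_rig"],
}

print(assets_to_build(["world_editor.db"], REFERENCES))
```

With this table the result is `['hero_rig', 'hero_model', 'level_01', 'world_editor.db']`; `unused_prop` is never built because nothing references it. A "just build everything" system would instead iterate over every key it finds on disk.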

The system I’m working with now is a “just build everything” based system. There are rules for how certain directories in a directory tree are processed, but other than that, everything goes into the wood-chipper together. We end up with a dependency graph of what is used, by whom, etc, but this comes out at the end and doesn’t drive the system itself.

Another thing to figure out is where your asset metadata is going to live. If a model has a parameter “breakable”, or “mass”, or “trigger-object”, etc. where does this data live? It could exist as data within the model file itself (Max in your case), in a separate file “model_metadata.xml”, or in a DB. It sounds like you are favoring the DB approach. Again, there are ups and downs with each approach.

[ul]
[li]DB[/li]If you are storing your object metadata in a DB, how are you going to do “atomic” submits that keep the DB and revision control in sync? Say you want to rename a directory full of assets: you will need to rename them and queue that change up in your RCS package, then you need to rename the entries in the DB. How do you submit these changes together, and hopefully in such a way that you can revert this change should something go wrong? How do you submit 100 changes to the DB that correspond to the file rename operations without exposing this to others on the team mid-change?
[li]In File[/li]Since you’re adding metadata to the individual binary files (at least with Maya you have a .ma you can parse if need be), how are you going to batch process changes to those files? Let’s say you need to know which files are flagged as “breakable”, how do you come up with a list? Do you have to rely on the builder for this info? After a build has completed, and you have a list of files that need to be altered, how are you going to wrap Max so that it can grind through those 100 files to make the change? BTW, I have never had any art package remain “trustworthy” after opening more than a few dozen files. Having done this with both Max and Maya, you will need to have a system that either launches a new copy of the executable for each file processed, or which can in some way detect when things have “gone south” and allows you to pick up your batch process midstream. Since you are working from a list generated by the last build, you may not catch all the cases where the parameter needs to be changed unless you’ve told the artists that the pool is closed.
[li]Separate File[/li]Hey, at least you can do a find in files with these, and determine how many “player_placeable” items there are (or whatever). How are you going to keep the data in these separate files in sync with what is in the asset (Max) file? Say you can mark a bone/joint in a rig as being an FX emitter. This data lives in the metadata file. What happens if an artist renames or removes that bone? What catches this, and conveys to the artist that this is their issue to fix? What will your file format be, and how will you edit it? Will you need to write an app that modifies them? (spoiler: you will) I would recommend using one of the standards for serialized data like this, but which one: XML, JSON, Protobuf, INI, etc.? Each has varying degrees of favorability for specific tasks, and disadvantages in others. For example, INI files are easily human readable and grep-able, but their schema is flat and doesn’t extend or support complex data easily.
[/ul]
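
The "new copy of the executable for each file" advice from the In File item can be sketched roughly as below. This is only an outline under assumptions: `run_batch` and the log file are hypothetical, and `command` stands in for whatever command line launches your art package on a single file, which is not shown here.

```python
import subprocess
from pathlib import Path


def run_batch(files, command, done_log, timeout=600):
    """Run `command + [file]` in a fresh process per file, skipping finished ones."""
    log_path = Path(done_log)
    done = set(log_path.read_text().splitlines()) if log_path.exists() else set()
    failed = []
    for name in files:
        if name in done:
            continue  # already processed in an earlier run
        try:
            result = subprocess.run(command + [name], timeout=timeout)
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            ok = False  # treat a hang as a failure
        if ok:
            with log_path.open("a") as log:
                log.write(name + "\n")  # record success right away
        else:
            failed.append(name)
    return failed
```

Because each success is written to the log immediately, rerunning the same call after a crash picks up where the batch stopped, and the returned list shows which files still need attention.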

Anyway, there’s a couple things to think over…
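
For the sync question raised under Separate File, one possible check is sketched below, assuming a JSON sidecar with a hypothetical `fx_emitters` key and a list of bone names exported from the rig by some other tool. Neither the key nor the export step comes from an existing format.

```python
import json


def find_stale_emitters(metadata_text, scene_bones):
    """Return emitter bone names in the metadata that the scene no longer has."""
    metadata = json.loads(metadata_text)
    known = set(scene_bones)
    return [bone for bone in metadata.get("fx_emitters", []) if bone not in known]


# Hypothetical sidecar content and bone list exported from the rig.
SIDECAR = '{"asset": "hero_rig", "fx_emitters": ["hand_L", "muzzle"]}'
BONES = ["root", "spine", "hand_L", "hand_R"]

print(find_stale_emitters(SIDECAR, BONES))  # ['muzzle']
```

Run at save or submit time, a check like this is one way to catch the renamed or removed bone and report it to the artist who made the change.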

Theodox · March 28, 2017, 9:46pm

Python is a good choice for the glue that holds it all together - it’s a very flexible language that handles trivial tasks cheaply (“rename all these files!”) but can also build complex applications. It also includes tons of functionality for free, so the hassles involved in keeping lots of workstations up to date with DLLs and libs are vastly reduced.
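
As an example of the "rename all these files!" kind of task, here is a small sketch using only the standard library. The `add_prefix` function, the prefix, and the `*.max` pattern are illustrative assumptions.

```python
from pathlib import Path


def add_prefix(folder, prefix, pattern="*.max"):
    """Rename every file matching `pattern` in `folder` to start with `prefix`."""
    renamed = []
    # sorted() collects the matches first, so renaming does not disturb the loop.
    for path in sorted(Path(folder).glob(pattern)):
        if path.name.startswith(prefix):
            continue  # already renamed, so the call can be repeated
        target = path.with_name(prefix + path.name)
        path.rename(target)
        renamed.append(target.name)
    return renamed
```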

It’s bad at two important things.

First, it’s slow compared to things like C or C++, especially for high-speed calculations – you could write something math heavy like a lightmapper or a 3d editing app in Python, but you shouldn’t. Write the high performance parts in C, C++ or C# and use Python for glue <plug>or look at FabricEngine</plug>.

Second, it’s designed and implemented by the kind of people who think ‘run this shell script and compile on your machine with gcc’ is a distribution strategy. If you go python, you have to have a strategy for distributing your python environment that goes beyond “copy these files” – you need to manage paths and dependencies. It’s a perennial topic here (too late tonight to look up threads right now but there are a few just from the last few months).
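
One small piece of such a strategy, managing paths, might look like the sketch below. It assumes a hypothetical environment variable, `STUDIO_PYTHON_PATH`, that lists shared tool directories; it does not address installing or versioning the dependencies themselves.

```python
import os
import site
import sys


def add_studio_paths(env_var="STUDIO_PYTHON_PATH"):
    """Add each existing directory listed in the environment variable to sys.path."""
    added = []
    for entry in os.environ.get(env_var, "").split(os.pathsep):
        if not entry:
            continue
        entry = os.path.abspath(entry)
        if os.path.isdir(entry) and entry not in sys.path:
            site.addsitedir(entry)  # also processes any .pth files found there
            added.append(entry)
    return added
```

Calling this from a startup script in each application gives every workstation the same import paths, as long as the variable points at a shared location.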

A minor but real issue is that there is no single standard GUI library for python. QT is popular here (largely due to the Maya link), and alternatives like WX or TK are also easy to find; but unlike, say, the single python standard for downloading files via FTP none of these comes bundled with the language and all have to be manually packed up and distributed.

Still, on the whole python is the best language for pipeline development because (a) it’s in lots of packages, (b) it’s super capable and (c) it’s fun to write. For your web-based approach you could look at something like PyJS to maximize code reuse between the working parts and the web front end. There are great frameworks for databases (Django) and networking (Twisted) that make those sides “pretty simple” compared to, say, doing it from scratch in C++.

JHN · March 28, 2017, 9:46pm

I had written a lengthy reply but it somehow got lost. Let’s try again today.

I should have mentioned we’re not a game studio but an animation studio: think commercials, animated series, b2b animations, serious game projects etc. Projects with various scopes/budgets and deadlines. Assets are important, as are scenes. I think I need to think of both as the same sort of entity with similar properties, although scenes obviously have more dependencies.

But we are not very big on dependencies (yet!?). Dependencies of objects and files are not as important as feedback and QA are.

I still favor DB for our needs especially when I will mix it with a feedback system, where even the clients can submit (controlled!) feedback. Renaming/refactoring are much less of an issue for us especially because I’m thinking about some sort of URI or hashtag if you will, for files and objects etc.
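
The URI idea could be sketched roughly like this: each asset gets a permanent identifier that is never derived from its path, so a rename only touches one lookup entry. The `AssetRegistry` class is hypothetical, and the in-memory dictionary stands in for what would presumably be a database collection.

```python
import uuid


class AssetRegistry:
    """Map permanent asset IDs to their current file paths."""

    def __init__(self):
        self._paths = {}

    def register(self, path):
        asset_id = uuid.uuid4().hex  # random, never derived from the path
        self._paths[asset_id] = path
        return asset_id

    def rename(self, asset_id, new_path):
        if asset_id not in self._paths:
            raise KeyError(asset_id)
        self._paths[asset_id] = new_path  # the ID itself never changes

    def resolve(self, asset_id):
        return self._paths[asset_id]
```

Feedback entries and scene references would then store the ID rather than the path, which is why renaming becomes much less of an issue.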

@Theodox, there’s some very good info there. Especially distribution and I will look them up myself, thanks!

Also Qt I’m not sure about, especially with it being GPL, don’t know if we want such an important tool depending on such licenses, what happens when we open the pipeline for 3rd parties?!
I will definitely look into PyJS (have a lot of experience with web languages).

Thanks!

-Johan

Theodox · March 28, 2017, 9:46pm

You might also want to look at tactic, which is an open source asset management solution. I’m not sure how ‘open’ their open source is, but it might be a good place to start researching – and it’s written in python.

JHN · March 28, 2017, 9:46pm

Tactic is definitely being investigated! Thanks!

-Johan