DCC Kit package critiques: some Python 3 modules for a clunky 3D asset pipeline

Hi all!
As I mentioned today on Slack, I wanted to share something with you at last.
I tried to write a couple of modules to help me while writing Blender addons for our clunky pipeline.
While probably overcomplicating things, I tried to expand it into something that could help with writing scripts in different DCC applications.
It’s probably noobish stuff that could be done in half the lines of code, but I tried :slightly_smiling_face:
I don’t dare say that any of you will find this useful; it’s more about exiting my comfort zone, showing what I did, and looking for critiques… and learning.
If any of you would like to give me feedback on the coding, structure, idea, whatever, I’ll take it with pleasure: for me it’s a chance to learn and improve my skills.
So making it short, this is the github link:
dcckit github
More info on the package and the idea behind it is in the repo’s README.

This is my very first shared code… but please don’t be too nice to it :wink:

Thanks for any comments on this!


That’s pretty clean for a first public GitHub repo! My early ones were fugly :slight_smile: And congrats on being willing to put it out there – this is a good way for the community to learn together.

Somebody will probably disagree with me on the stuff below, and that’s a good learning opportunity for everybody too!

Another thing that is very cool and way too often neglected is the clear and well written readme: people always forget that nobody else knows this stuff the way the author does and skimp on this bit – raw code is not “self explanatory” no matter what the crusty old programmer on your team might say. Good context makes for better code reviews since it sets expectations well. The main thing the public presentation needs is some example code – I know you’re not really chasing GitHub fame here, but when a project is trying to attract users it really helps for a browsing person to be able to see what the package will do to the layout and feel of their code if they use it.

What needs to be a class?

You correctly noted in your intro that this is a question here.

A good rule of thumb in Python (this is not true in, say, C#) is that too many methods in a class which don’t use self is a bad sign. For example, in export_asset() the self references are to the scene name and file path: you could easily have passed those in as parameters instead of delegating them to the __init__(). As written you’ll need to create a new DCC subclass instance for each file, which is not a big deal but tends to show that the “class” here is acting more like a namespace and less like a living object with state that needs to be managed. It’s handy to have that self for keeping related data together, but here it’s hard to imagine a scenario where you need two of these class objects to exist simultaneously with different contents – a good indication that this might be doable as static functions instead.
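
To make that concrete, here’s a rough sketch of the “plain functions instead of a stateful class” idea – the function names echo the ones in the thread, but the module layout is invented for illustration, not taken from dcckit:

# blender_export.py -- hypothetical module-level version of the same idea;
# nothing here holds state, so there is no class and no self to manage.
import bpy


def query_current_scene_name():
    """Return the name of the currently opened .blend file (no extension)."""
    return bpy.path.display_name_from_filepath(bpy.data.filepath)


def query_current_scene_filepath():
    """Return the full path of the currently opened .blend file."""
    return bpy.data.filepath


def export_asset(asset_name, out_folder):
    """Everything the function needs arrives as parameters, not via __init__()."""
    scene_name = query_current_scene_name()
    scene_path = query_current_scene_filepath()
    # ...select the objects belonging to asset_name and call the real exporter...
    print(f"exporting {asset_name} from {scene_name} ({scene_path}) to {out_folder}")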

Don’t do hard work in constructors

One nice thing about functions is that they are a little clearer about when work is actually happening. For example, as written, self.scene is being populated in the constructor:

       scene_tree_root = self.query_current_scene_tree()   
       self.scene = Scene3d(scene_name, scene_tree_root, scene_filepath)  

So if the calling code creates a new BlenderDCC on a large scene, all of the analysis of the scene is going to happen when the object is created and not when the export is actually run – this could create surprising behaviors if, for example, this code is called when an export dialog is opened but before the user has actually hit “export”: they could add something to the scene that is not captured, because the exporter’s view of the scene derives from the birth of the object, not from the execution of export_asset().

Think about when to make lists

That brings up a side point which you can ignore if your use cases are not very big, but which will matter when you’re dealing with larger scenes: you’re making a data structure which duplicates the scene hierarchy. That’s no big deal for a few hundred or even a few thousand objects, but it might become sluggish if you’re dealing with another couple of orders of magnitude. You need the data in some form to select the objects for export – but could you write this so that, rather than building a big list and then selecting its contents, you iteratively selected the objects at export time without making the list? In many applications it won’t matter, but in data-heavy cases – especially ones where Python has to keep extending the size of that duplicated list as it grows – it can be slower.
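
For example, a generator lets you select as you traverse instead of building the duplicate structure first – a minimal sketch, where belongs_to_asset() is a made-up predicate standing in for whatever grouping logic you actually use:

import bpy


def belongs_to_asset(obj, asset_name):
    """Hypothetical predicate: decide whether a scene object is part of the asset."""
    return obj.name.startswith(asset_name)


def iter_exportable(asset_name):
    """Yield matching objects one at a time instead of collecting them in a list."""
    for obj in bpy.context.scene.objects:
        if belongs_to_asset(obj, asset_name):
            yield obj


def select_for_export(asset_name):
    """Walk the generator and select as we go -- no duplicated hierarchy in memory."""
    bpy.ops.object.select_all(action='DESELECT')
    for obj in iter_exportable(asset_name):
        obj.select_set(True)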

That said – if it’s working and it’s acceptable, don’t sweat it. Avoiding the runtime creation of big lists is a good habit to have, but it won’t kill you.

Do you need multiple DCCs?

All that said, in this case the argument for a class is mostly that you are delegating responsibility for the process of finding the scene name and path to the subclasses, which does help a bit with reuse. However, that might also be accomplished by marrying static functions that do the work to DCC-specific static functions that get things like the scene name. A good compromise might be to keep the cross-DCC logic in one class that consumes other classes that do the app-specific work:
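
Something along these lines – the class and method names here are invented to show the shape of the idea, they are not the actual dcckit API:

import bpy  # only BlenderSceneInfo below actually touches Blender


class BlenderSceneInfo:
    """App-specific piece: only knows how to ask Blender about the open scene."""

    @staticmethod
    def scene_name():
        return bpy.path.display_name_from_filepath(bpy.data.filepath)

    @staticmethod
    def scene_filepath():
        return bpy.data.filepath


class AssetExporter:
    """Cross-DCC piece: owns the export flow and consumes an app-specific helper."""

    def __init__(self, scene_info):
        # Composition: any object exposing the same two methods will do.
        self.scene_info = scene_info

    def export_asset(self, asset_name, out_folder):
        name = self.scene_info.scene_name()
        path = self.scene_info.scene_filepath()
        # ...shared export logic built on name and path...
        print(f"exporting {asset_name} from {name} ({path}) to {out_folder}")


# Usage: the only Blender-specific bit is the helper you hand in.
exporter = AssetExporter(BlenderSceneInfo())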

Personally I’m a sceptic about writing “dcc agnostic” software – my experience has generally been that the implementation detail in a package tends to massively outweigh the common elements, so the extra layer of abstraction doesn’t pay for itself. The rule of thumb I learned the hard way is: generalize only when you have to. But that’s just me – your use case may really require it. I’d only say don’t do it unless you must.

Tell users how you expect them to specialize

If you do go down the multiple-DCC route, you want to be very careful about which portions of the inherited API are DCC specific and which are not. Inheritance is one way to do this – though you could accomplish the same goal with modules that had a common set of static functions for each DCC, without the extra baggage of classes. That’s a personal style preference of mine when the universe of potential derived versions is small (e.g. Maya, Max and Blender), but if there were 12 different DCC apps it would make more sense to use a class.

If you do decide to use inheritance for specialization, you want to be very deliberate about how you want future coders to do the detail work. One thing that struck me was that your comment suggests export_asset() wants to be overridden in a specific DCC – but if somebody takes that literally they’ll be re-implementing lines 356-396, though I see you did the smart thing and used super(). It can be kinder to potential users to separate the methods that are intended to be common (maybe giving them the visual distinction of a leading underscore or a name like export_asset_common()) from the ones that are publicly overridable. In that style the export_asset() in the base class would raise NotImplementedError to force people to do the derivation, and explicitly suggest that people call the internal method to get the shared functionality.
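
A rough sketch of that pattern – export_asset() is from your code, the rest of the names are just illustrative:

class Dcc:
    """Base class: the shared logic lives in a clearly internal method."""

    def export_asset(self, asset_name, out_folder):
        # Force subclasses to supply the app-specific half.
        raise NotImplementedError("Subclasses must implement export_asset()")

    def _export_asset_common(self, asset_name, out_folder):
        """Shared, DCC-agnostic part of the export (naming, paths, logging...)."""
        print(f"common export prep for {asset_name} -> {out_folder}")


class BlenderDcc(Dcc):
    def export_asset(self, asset_name, out_folder):
        # Explicitly call the shared piece, then do the Blender-specific work.
        self._export_asset_common(asset_name, out_folder)
        # ...bpy selection and bpy.ops export call would go here...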

Don’t let the length of the comments here put you off – this is way better than what real newbies do. It’s got docstrings and logical names and makes sense, which is not a trivial achievement.


First of all, thanks @Theodox for your deep answer and analysis: this is exactly what I hoped to read here :slight_smile:

As I said, my coding knowledge is quite meh, so bear with me if I don’t get everything from your post at first glance.

Well… this is encouraging!

If I understand correctly, your point is that if a method makes very little use of self attributes/other methods, it could be better not to make it a class method at all, right?
In export_asset() I also have
asset_to_export = self.scene.search_asset_by_name(asset_name)
because the Dcc asks the scene member to provide the asset to export. How could I manage this?

You’re right: in theory I’ll never have 2 Dcc instances at the same time, because just one operates on the scene.

And this is exactly what happens in the currently used version of the Blender addon: when the artist launches it, a PySide dialog appears, listing all scene Assets, each one coupled with its Group name, if any (Tags weren’t present in that version).
To populate the dialog, I recursed over the scene, basically doing what is now done inside
self.query_current_scene_tree()
For example, this is my old dialog on a scene containing 3 different groups (TV models) and with ModA having 2 TVs inside it.


At the moment, the only way I have to show this to the artist and allow him/her to select what to export is to build this tree, and as you wrote this happens before the actual “export” action.
I could maybe delay all the Asset3d and Scene3d creation, doing only a quick recursion on names; that should be faster.

I think that this could relate to the previous point: I still need at least a list or something of just asset names, but I could try to delay the Primitives, Assets and Scene instantiation. But I’m not really sure how to handle this :slight_smile:

Mmh… but I still need a subclass of Dcc because I also need to perform the scene parsing (and every software has its own way to represent the hierarchy) and the export action (done with its Python API), so I thought that the subclass is actually doing more than querying names… or not? :slight_smile:

I know that, I think you wrote this a couple of times on Slack too.
And we’ll drop 3ds Max entirely in the future, so it may be that I’ll never need to actually write a MaxDcc class.
I did this because it seemed to me that some of this could be software agnostic, and I thought it could be a nice chance to practice coding and (let me say) to give something back to the community in the far future, as a small reward for all I got from it.
I’m still intrigued by the idea of cross-software tools, given the chance to use a common language (aka Python)… but it’s just me: I’m sure your experience on this counts more than my feeling. :wink:

Correct… maybe override is not the proper word.
But using super() is exactly my idea: the parent class export_asset() has the base code and does the first part of the job… then the child class finishes the export task.

So, following this approach, I should move all the parent’s code from export_asset() to export_asset_common(), and make export_asset() just raise NotImplementedError (it’s an exception, I suppose).
Then the child class overrides (well, implements) export_asset(), just like it’s doing now.
Correct?

This makes me happy and cures my imposter syndrome a little :joy:
And I commented with another nice wall of text!
Thanks again bud!

On the first bit, anyway, this is a great guide to the thinking:

It’s not as ranty as the title makes it sound :slight_smile:

On the list creation bit, it seems like ultimately you care about having a collection of a small-ish number of scene item paths that will become the selection for your exported file – as long as that list is small it’s not too bad either way; you’ll be fine. Large lists which stay in memory for a long time are good to avoid on principle, but in a case like this, where we are talking 10^4 items or less, it is not life or death.


Will watch it! :slight_smile:

I never expect to have such large scenes: we tend to organize our game assets into smaller .blend/.max files, each one usually containing no more than 20 or so assets.

Regarding your previous advice about
> Tell users how you expect them to specialize
I branched my repo and made a couple of changes following your idea…
I don’t really know how to use pull requests yet, but I tried, so I think you can see the change. (Are branches without pull requests visible to other people?)

Cheers!

Yes, they can see the branches from the main UI (click on “main”) – this should show the changes:

And that’s definitely along the lines I was thinking. My personal style would be to include the expected sample code in the docstring of the to-be-overridden method, but anything which clearly tells the intended user the expected pattern is fine – the nice thing about Python is that they can see the source code, after all.
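
For instance, something like this sketch – the docstring content is just an illustration of the idea:

class Dcc:
    def export_asset(self, asset_name, out_folder):
        """Export one asset. Override this in your DCC subclass.

        Expected pattern:

            def export_asset(self, asset_name, out_folder):
                self._export_asset_common(asset_name, out_folder)
                # ...app-specific export call here...
        """
        raise NotImplementedError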

Before going to bed, I’m thinking about what Steve said before about… :slight_smile: :grimacing:

That’s right.
I need to instantiate my Scene3d and populate it if I want to show a list of exportable assets in my GUI, but maybe it could be better not to do this inside the __init__?

Now my code is:

def __init__(self):
    scene_name = self.query_current_scene_name()  # Get the currently opened scene name from scene file
    scene_filepath = self.query_current_scene_filepath()  # Get the currently opened scene filepath
    scene_tree_root = self.query_current_scene_tree()   # Build the assets hierarchy from the scene
    self.scene = Scene3d(scene_name, scene_tree_root, scene_filepath)  # Create and store a Scene3d object
    self.scene_file_type = ""  # Scene file extension

What if I change it to something like:

def __init__(self):
    self.scene = None
    self.scene_file_type = ""  # Scene file extension

And then add another class method like:

def read_scene(self):
    scene_name = self.query_current_scene_name()  # Get the currently opened scene name from scene file
    scene_filepath = self.query_current_scene_filepath()  # Get the currently opened scene filepath
    scene_tree_root = self.query_current_scene_tree()   # Build the assets hierarchy from the scene
    return Scene3d(scene_name, scene_tree_root, scene_filepath)  # Create a Scene3d object

So I could instantiate my Dcc and populate the scene only when needed, doing

self.scene = self.read_scene() ?

Or let read_scene() directly set the self.scene member instead of returning it.

This shouldn’t give me any real benefit for my GUI, since I still need the populated scene when I open it (to show the filled list), but it could make the Dcc creation faster in cases where I don’t need the list right away.
Or I could even open the GUI showing nothing and add a button like “Detect Assets” or “Update Assets List” linked to Dcc.read_scene().

Not sure exactly how to improve this.

Another idea is, since I’ll have one and only one Dcc class at a time… do I really need to instantiate it?
Why not use only class methods?
Is it possible to simply do Dcc.something() instead of

my_dcc = Dcc()
my_dcc.something()

?
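
Something like this, just to show what I mean (read_scene() used here only as an example):

class Dcc:
    @classmethod
    def read_scene(cls):
        # No instance needed: callable directly on the class.
        print("reading the scene...")


# Both of these work; the second never creates an instance.
Dcc().read_scene()
Dcc.read_scene()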

There are two classic tricks for this kind of situation.

One would be to do the work asynchronously – so when the dialog is invoked you start a separate job (in Maya, you’d probably time-slice using an idle-time scriptJob) that incrementally populates the list. That is actually slower than just doing it once, but since the user gets UI earlier and sees the work being done, it feels faster.
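
In Blender the rough equivalent might be an application timer – a sketch only, with the batch handling and the dialog hookup left as placeholders:

import bpy

_pending = []   # object names still to be processed
_results = []   # rows already delivered to the dialog


def _populate_batch():
    """Process a small batch per tick so the UI stays responsive."""
    batch = _pending[:50]
    del _pending[:50]
    for name in batch:
        _results.append(name)  # in a real tool: add a row to the dialog widget
    # Returning a float reschedules the timer; returning None stops it.
    return 0.05 if _pending else None


def start_async_population():
    _pending[:] = [obj.name for obj in bpy.context.scene.objects]
    _results.clear()
    bpy.app.timers.register(_populate_batch, first_interval=0.0)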

The other would be to maintain this list – which is scene specific – at all times, adding to it incrementally as the scene is edited. You’d still have a one-time hit (on file open) but as long as you had a good way of maintaining the list quietly in the background you’d be spreading the work of tweaking things into the right shape over a whole work session.
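
Again in Blender terms, the hook for that kind of quiet background maintenance could be an application handler – a rough sketch, where rebuild_asset_list() stands in for whatever bookkeeping you actually need (a smarter version would only touch what changed):

import bpy
from bpy.app.handlers import persistent

_asset_names = set()


def rebuild_asset_list(scene):
    """Placeholder: refresh the cached names from the current scene."""
    _asset_names.clear()
    _asset_names.update(obj.name for obj in scene.objects)


@persistent
def _on_depsgraph_update(scene, *args):
    # Called after the scene changes; keeps the cached list in sync.
    rebuild_asset_list(scene)


bpy.app.handlers.depsgraph_update_post.append(_on_depsgraph_update)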

From your description it sounds like you could just keep doing what you’re doing, since “optimizing” a few dozen list operations is not a great investment – but from the standpoint of good habits it’s a good idea to learn how to hide big chunks of work so the user doesn’t realize they are going on…


Yes, I don’t think I’ll hit that problem in our projects: asset lists will be quite short, I suppose.
The two ways you proposed are both interesting: I’m not going to test them now, but I’m wondering which one would be easier or better.

Doing a quick test in Blender, I found two bugs introduced while rewriting the methods…
I wrote (in the BlenderDcc class):

def __init__(self):
    super().__init__(self)
    self.context = self.build_context()
    self.scene_file_type = "blend"

That self argument in the super init call was wrong :smiley:
Plus I need to move the self.context init before the super() call, or it will fail when recursing the Blender scene (I need the context set there).

But why a method for just getting bpy.context?
Well… Context in Blender is quite messy, and on Slack we already talked about how easy it is to end up in strange situations where you call a script from some unpredictable context.
So my idea is to (in the future) use a smarter way to actually build an ad hoc context instead of just getting the bpy one.
But for now, I’m keeping it, together with my fingers crossed. :slight_smile:

I had some time to work on this small project again and I found some bugs, now solved and pushed to git.
But most importantly, I was able (with quite a decent effort…) to refactor my custom Blender exporter: I basically stripped all the assets/scene logic from it and imported this DCC kit.
I’m happy to say that it still works! :slight_smile:
While actually using this package in a real tool, I also found some stuff that isn’t really OK (for example, the method for building the full asset name, tags included).

The next step is to store some more info about an asset, like the polycount.
Why?
Well, I’d like to add some “asset validation” code too.
It could be nice to have some “rules file” (containing requirements for the asset) and perform some validation. It could be useful in combination with the exporter.
Any ideas?
What could be the best approach for this “rules file”?

Depends a bit on the problem space.

If you know the range of things you’d do on a lot of projects, it can be nice to move all the implementation into a library of known functions and then just have a piece of data – an INI or a JSON file or similar – that hooks up the validations to the asset types. This works well if you have a lot of use cases with overlapping but not identical requirements.
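
As a sketch of that data-driven idea – the file layout, the rule names, and the polycount attribute are all invented for illustration:

import json

# A hypothetical rules.json hooking asset types up to named validations:
# {"prop": ["has_correct_naming", "max_polycount"]}

VALIDATORS = {}  # rule name -> callable(asset) -> bool


def validator(name):
    """Decorator: register a validation function under a rule name."""
    def register(fn):
        VALIDATORS[name] = fn
        return fn
    return register


@validator("has_correct_naming")
def has_correct_naming(asset):
    return asset.name == asset.name.lower()


@validator("max_polycount")
def max_polycount(asset):
    return asset.polycount <= 5000  # assumes the Asset stores a polycount


def validate(asset, asset_type, rules_path="rules.json"):
    """Run every rule listed for this asset type; return a name -> pass/fail dict."""
    with open(rules_path) as f:
        rules = json.load(f)
    return {name: VALIDATORS[name](asset) for name in rules.get(asset_type, [])}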

On the other hand, if you don’t know the problem space well (Do I need to count polygons? Do I have limited texture formats? Will I have animations as well as models? etc.) you’ll have to write the code as you go along. The only advice I’d offer there is to treat every individual validation as separate and not lump them into a single monster validate() function. It’s much better to have a bunch of separate, very specific functions:

HasCorrectNaming()
AllTexturePathsInsideProject()
NoNurbsGeometry()
AllIntegerKeyframes()
NoSimulationsEnabled()

or whatever your validations are – these should be able to run individually, so when you realize that, say, you care about keyframes in your animation files but only care about texture paths in your model files, you can easily make two different validation paths that share common things (like correct naming) without having to track dozens of if...thens inside one function to get appropriate results. The quotable buzzword here is “prefer composition over inheritance”.

Also – as a practical matter – validation is almost always stateless, so most validation tasks work well as static functions; they rarely need to be class instances. Lots of individual functions that validate individual instances are easy to test and maintain. Then you can check the overall status as a checklist or list of pass/fail results – much easier to reason about than a 1000-line monster function with lots of potential code paths.
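
A small sketch of that checklist style, reusing a couple of the names above (the asset attributes are assumptions):

# Each check is a tiny, stateless function: easy to test on its own.
def has_correct_naming(asset):
    return asset.name == asset.name.lower()


def all_integer_keyframes(asset):
    return all(float(k).is_integer() for k in asset.keyframe_times)  # assumed attribute


# Compose different validation paths from the same small pieces.
MODEL_CHECKS = [has_correct_naming]
ANIM_CHECKS = [has_correct_naming, all_integer_keyframes]


def run_checklist(asset, checks):
    """Return [(check name, passed?), ...] instead of one monolithic result."""
    return [(check.__name__, check(asset)) for check in checks]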

Good hints as always @Theodox, thx!
Keeping validation steps separate sounds fine, and it’s also good for extending things over time.
Static should be fine too, if I manage to store all the info in my Asset class, but then how do I handle validation for a specific software?
The DCC-agnostic class is still there.
Maybe I could make them static methods of the master Dcc class, so it would be something like dcc.check_polycount(my_asset).

Or (but I’d have to understand how to manage the agnosticism), add a validation.py in my package with all the static functions inside, no class.

Edit: Mmh… maybe I could keep all these functions static and without a class, but pass them a dcc object as a parameter, like validate_this_stuff_in_a_dcc(asset, dcc).
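
Roughly like this, just to pin the idea down (all the names are placeholders):

def validate_polycount(asset, dcc, limit=5000):
    """Stateless check that asks the dcc object for the software-specific data."""
    return dcc.get_polycount(asset) <= limit  # get_polycount() is hypothetical


def validate_asset(asset, dcc):
    checks = [validate_polycount]
    return [(check.__name__, check(asset, dcc)) for check in checks]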