Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 6:42 am

(Split off from Automatic Backup Plugin)

sublimator wrote:Maybe we could create a plugin, that searches the header comments for meta information, such as indentation size? I wonder if things like that will be settable on a per project basis.


There's no reason a plugin couldn't be created that set the tabSize and translateTabsToSpaces settings via onLoad. It's pretty simple to do a reasonably good job at detecting indentation settings:

* Grab the first couple of hundred lines from the file (enough that long header comments don't break things, but not so many that it keels over on million-line log files)
* Remove all the lines that start with no white space or only a single space
* Of the remaining lines (assuming there are enough to make a valid attempt, say > 10):
- If the majority start with a tab, set translateTabsToSpaces to false
- Otherwise, set the tabSize to the highest value such that 90% of the lines have the property numberOfLeadingSpaces % tabSize == 0
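
A rough sketch of the recipe above in plain Python, with no editor API involved. The function name, the returned dict, the 200-line sample, and the 8-down-to-2 candidate sizes are assumptions made for illustration; for a mostly tab-indented sample it turns translateTabsToSpaces off so real tabs are kept.

```python
def detect_indentation(text, max_lines=200, min_samples=10, threshold=0.9):
    """Return a dict of settings to change; an empty dict means 'leave the defaults alone'."""
    # Look only at the start of the file so huge log files stay cheap.
    lines = text.splitlines()[:max_lines]

    # Keep lines that start with a tab or with at least two spaces.
    indented = [ln for ln in lines if ln.startswith("\t") or ln.startswith("  ")]
    if len(indented) <= min_samples:
        return {}  # not enough evidence to make a valid attempt

    tabbed = sum(1 for ln in indented if ln.startswith("\t"))
    if tabbed * 2 > len(indented):
        # Mostly tab-indented: keep real tabs.
        return {"translateTabsToSpaces": False}

    # Leading-space count per line; tab-led lines get 0 and never count as a match.
    widths = [len(ln) - len(ln.lstrip(" ")) for ln in indented]
    for size in range(8, 1, -1):  # assumed candidate sizes: 8 down to 2
        matches = sum(1 for w in widths if w and w % size == 0)
        if matches >= threshold * len(indented):
            # Turning translation on here is an assumption, not part of the recipe.
            return {"tabSize": size, "translateTabsToSpaces": True}
    return {}
```

A plugin would then copy whatever keys come back into the view's options, and do nothing when the dict is empty.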
jps
Site Admin
 
Posts: 3062
Joined: Wed Mar 19, 2008 12:33 pm

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 6:50 am

sublimator wrote:Sweet, are you planning on doing one? Thanks for the recipe, regardless :)


Eventually, unless someone else does it first ;)

sublimator wrote:Maybe a shell script onLoad hook to read the file type from the shebang (#!)?


I've got an item on the todo list to make this part of the current file loading process. Doing it via a plugin isn't the best way to go, as changing the syntax will cause the entire file to be re-lexed, which isn't as efficient as loading it with the correct syntax definition in the first place.

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 1:15 pm

Cool!

Comments:
- Is the setTimeout actually needed? I would have thought you could call detectIndentation directly from onLoad.
- You probably meant min() below, rather than max().
Code:
sample = view.substr(sublime.Region(0, max(view.size(), 25000)))

- You can pass bools and ints to options.set(), e.g.,
Code:
view.options().set('translateTabsToSpaces', True)
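
On the min() versus max() comment above, a tiny stand-alone comparison using plain integers in place of view.size(). The helper names are hypothetical; only the 25000 limit comes from the snippet.

```python
LIMIT = 25000  # sample limit taken from the snippet above

def sample_end_with_max(size, limit=LIMIT):
    # The original expression: never smaller than the file size.
    return max(size, limit)

def sample_end_with_min(size, limit=LIMIT):
    # The suggested fix: capped at `limit` and never past the end of the file.
    return min(size, limit)

for size in (100, 1000000):
    print(size, sample_end_with_max(size), sample_end_with_min(size))
```

With max(), a small file asks for a region that runs past its end and a large file is read in full; with min(), the sample is always the smaller of the two.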


Also, instead of:

Code:
for indent in sorted(spacesList):


I think you'll get better results by doing:

Code:
for indent in xrange(8, 1, -1):


because you want to test from largest to smallest (otherwise a single 2 space indentation in an otherwise 4 space indented file will yield a tabSize of 2), and you only need to test each indentation level once, so you may as well just iterate over the small set of possibilities.
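
To make the ordering point concrete, here is a small sketch in modern Python (range instead of xrange). It assumes the loop returns the first candidate that at least 90% of the lines are a multiple of; the helper name and the sample widths are made up for illustration.

```python
def pick_tab_size(widths, candidates, threshold=0.9):
    # Return the first candidate that enough lines are a multiple of.
    for size in candidates:
        matches = sum(1 for w in widths if w % size == 0)
        if matches >= threshold * len(widths):
            return size
    return None

# Leading-space counts for 20 lines: one stray 2-space line in a 4-space file.
widths = [2] + [4] * 10 + [8] * 9

ascending = pick_tab_size(widths, sorted(set(widths)))  # tries 2, 4, 8
descending = pick_tab_size(widths, range(8, 1, -1))     # tries 8, 7, ..., 2

print(ascending, descending)
```

Going smallest-first stops at 2 because every width is even; going largest-first rejects 8 through 5 and settles on 4.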

Finally, before setting translateTabsToSpaces to false, it's probably worth checking that there are at least a handful of tabs in the file, otherwise opening an empty file will automatically clobber your default translateTabsToSpaces setting.
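
One possible shape for that guard, as a sketch only: the function name and the threshold of five tab-led lines ("a handful") are assumptions.

```python
def should_disable_translation(lines, min_tab_lines=5):
    # Count lines indented with a tab versus lines indented with spaces.
    tab_lines = sum(1 for ln in lines if ln.startswith("\t"))
    space_lines = sum(1 for ln in lines if ln.startswith("  "))
    # Only override the user's default when there is real evidence of tabs.
    return tab_lines >= min_tab_lines and tab_lines > space_lines

print(should_disable_translation([]))                # empty file: leave the default alone
print(should_disable_translation(["\tx = 1"] * 6))   # clearly tab-indented
```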

I'm looking forward to including this in the default distribution :)

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 1:30 pm

Lookin good :)

Tab cut off, as in how many characters to check? 25000 seems pretty reasonable; you certainly won't need any more than that.

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 1:41 pm

If every line in the file starts with a tab, will translateTabsToSpaces ever get set to False? It seems that startsWithTab and sampleLines will be equal in that case.

Also, it's likely worth making it a TextCommand rather than just a Plugin, so it can be run explicitly (say, after creating a new file and pasting a bunch of text in).

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 2:01 pm

ah ha, yes, that was me being foolish :)

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 2:13 pm

I'm loving this plugin, btw :)

I was just playing with it here, and had a file almost-but-not-quite get detected as having 4 spaces. Just under 90% of the lines were detected as having 4 spaces; it was thrown off by 3 lines which were nothing but a few spaces. I think a couple of tweaks to the heuristics would help:

- If a line is all whitespace, ignore it, as it's not actually indenting anything
- An 80% match is likely good enough, rather than 90%
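
A small sketch of how those two tweaks change the outcome. The sample below (26 lines indented by 4 spaces plus 3 lines of nothing but 3 spaces) is invented to land just under 90%, and the helper name is hypothetical.

```python
def matching_fraction(lines, size, skip_blank):
    # Fraction of space-indented lines whose indent is a multiple of `size`.
    widths = []
    for ln in lines:
        if skip_blank and not ln.strip():
            continue  # all-whitespace lines aren't indenting anything
        if ln.startswith("  "):
            widths.append(len(ln) - len(ln.lstrip(" ")))
    if not widths:
        return 0.0
    return sum(1 for w in widths if w % size == 0) / len(widths)

lines = ["    x = 1"] * 26 + ["   "] * 3

before = matching_fraction(lines, 4, skip_blank=False)
after = matching_fraction(lines, 4, skip_blank=True)
print(before, after)
```

Here `before` is 26/29, which misses a 90% cut-off but clears 80%, and `after` is 1.0 once the whitespace-only lines are ignored, so either tweak alone rescues this file.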

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 2:34 pm

...I'll just stick it in the next beta, life is simpler that way.

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 2:47 pm

Well... next beta will be out pretty soon.

My default settings are to just insert tabs, which means that when I edit a Python file indented with spaces, it gets nasty tabs added. This plugin makes it just work, which is pleasant.

Re: Automatically detecting indentation

Post by jps on Mon Mar 31, 2008 2:51 pm

Next beta will be out tonight or tomorrow night.
