Sublime Forum

Automatically detecting indentation

#1

(Split off from Automatic Backup Plugin)

There’s no reason a plugin couldn’t be created that set the tabSize and translateTabsToSpaces settings via onLoad. It’s pretty simple to do a reasonably good job at detecting indentation settings:

  • Grab the first couple of hundred lines from the file (enough that long header comments don’t break things, but not so many that it keels over on million-line log files)
  • Remove all the lines that start with no white space or only a single space
  • Of the remaining lines (assuming there are enough to make a valid attempt, say > 10):
    • If the majority start with a tab, set translateTabsToSpaces to false
    • Otherwise, set the tabSize to the highest value such that at least 90% of the lines have the property numberOfLeadingSpaces % tabSize == 0
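
For illustration, the steps above could be sketched roughly like this in plain Python, outside of any editor API. The function name, the return values, and the defaults (200 lines, more than 10 samples, candidate sizes 8 down to 2) are my own assumptions, as is the choice to apply the 90% test to the space-indented lines only.

```python
def detect_indentation(text, max_lines=200, min_samples=10, threshold=0.9):
    """Guess indentation from the start of a file.

    Returns ("tabs", None), ("spaces", size), or None when undecided.
    """
    lines = text.splitlines()[:max_lines]
    # Drop lines that start with no whitespace or only a single space.
    indented = [ln for ln in lines if ln.startswith(("\t", "  "))]
    if len(indented) <= min_samples:
        return None  # too little evidence, leave the settings alone
    tabbed = sum(1 for ln in indented if ln.startswith("\t"))
    if tabbed > len(indented) / 2:
        return ("tabs", None)
    # Leading-space counts of the space-indented lines only (an assumption).
    widths = [len(ln) - len(ln.lstrip(" ")) for ln in indented
              if not ln.startswith("\t")]
    for size in range(8, 1, -1):  # largest candidate first
        matching = sum(1 for w in widths if w % size == 0)
        if matching >= threshold * len(widths):
            return ("spaces", size)
    return None
```

A caller would then map "tabs" to translateTabsToSpaces = false, and "spaces" to translateTabsToSpaces = true plus the detected tabSize.
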

20080401 Beta
#2

Eventually, unless someone else does it first :wink:

I’ve got an item on the todo list to make this part of the current file loading process. Doing it via a plugin isn’t the best way to go, as changing the syntax will cause the entire file to be re-lexed again, which isn’t as efficient as doing it with the correct syntax definition in the first place.


#3

Cool!

Comments:

  • Is the setTimeout actually needed? I would have thought you could call detectIndentation directly from onLoad.
  • You probably meant min() below, rather than max().
sample = view.substr(sublime.Region(0, max(view.size(), 25000)))
  • You can pass bools and ints to options.set(), e.g., view.options().set('translateTabsToSpaces', True)
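
To spell out the min()/max() point with plain numbers: the sample should end at the buffer size or the cap, whichever is smaller. The helper name below is hypothetical and stands in for the second argument to sublime.Region.

```python
def sample_end(size, limit=25000):
    """End offset of the sample: the whole buffer, capped at `limit` characters."""
    return min(size, limit)

# With max(), a 100-character buffer would ask for 25000 characters (past the
# end), and a 1,000,000-character log file would be sampled in full.
small_with_max = max(100, 25000)
large_with_max = max(1000000, 25000)
```
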

Also, instead of:

for indent in sorted(spacesList):

I think you’ll get better results by doing:

for indent in xrange(8, 1, -1):

because you want to test from largest to smallest (otherwise a single 2 space indentation in an otherwise 4 space indented file will yield a tabSize of 2), and you only need to test each indentation level once, so you may as well just iterate over the small set of possibilities.
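
Here is a small standalone sketch of why the order matters. It is not the plugin’s actual code (that isn’t shown here); the helper name and the 90% threshold are assumptions, and it uses Python 3’s range where the plugin would use xrange.

```python
def first_matching_size(widths, candidates, threshold=0.9):
    """Return the first candidate that divides enough of the indent widths."""
    if not widths:
        return None
    for size in candidates:
        matching = sum(1 for w in widths if w % size == 0)
        if matching >= threshold * len(widths):
            return size
    return None

# Eleven lines indented by 4 spaces and one stray line indented by 2.
widths = [4] * 11 + [2]

ascending = first_matching_size(widths, sorted(set(widths)))
descending = first_matching_size(widths, range(8, 1, -1))
print(ascending, descending)  # prints "2 4"
```

Going smallest first, 2 divides every width and wins straight away; going largest first, 4 is accepted because 11 of the 12 lines match.
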

Finally, before setting translateTabsToSpaces to false, it’s probably worth checking that there are at least a handful of tabs in the file, otherwise opening an empty file will automatically clobber your default translateTabsToSpaces setting.
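
One possible shape for that guard, as a hedged sketch: the helper name is made up, and reading “a handful” as 5 tab-indented lines is my assumption.

```python
def has_enough_tabs(text, min_tab_lines=5):
    """True only if at least `min_tab_lines` lines start with a tab."""
    tab_lines = sum(1 for ln in text.splitlines() if ln.startswith("\t"))
    return tab_lines >= min_tab_lines
```

Only when this returns True would the plugin go on to set translateTabsToSpaces to false; an empty file returns False, so the default setting is left untouched.
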

I’m looking forward to including this in the default distribution :slight_smile:


#4

Lookin good :slight_smile:

Tab cut-off as in how many characters to check? 25000 seems pretty reasonable; it certainly won’t need any more than that.


#5

If every line in the file starts with a tab, will translateTabsToSpaces ever get set to False? It seems that startsWithTab and sampleLines will be equal in that case.

Also, it’s likely worth making it a TextCommand rather than just a Plugin, so it can be run explicitly (say, after creating a new file and pasting a bunch of text in).


#6

ah ha, yes, that was me being foolish :slight_smile:


#7

I’m loving this plugin, btw :slight_smile:

I was just playing with it here, and had a file almost-but-not-quite get detected as having 4 spaces: just under 90% of the lines were detected as having 4 spaces, and it was thrown off by 3 lines which were nothing but a few spaces. I think a couple of tweaks to the heuristics would help:

  • If a line is all whitespace, ignore it, as it’s not actually indenting anything
  • An 80% match is likely good enough, rather than 90%
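
A rough reproduction of that case, with made-up data (25 lines indented by 4 spaces plus 3 lines of just three spaces) and hypothetical helper names, showing what each tweak does to the match rate:

```python
def indent_widths(lines, skip_blank=True):
    """Leading-space count per line, optionally ignoring whitespace-only lines."""
    widths = []
    for ln in lines:
        if skip_blank and not ln.strip():
            continue  # an all-whitespace line isn't indenting anything
        widths.append(len(ln) - len(ln.lstrip(" ")))
    return widths

def share_divisible(widths, size):
    """Fraction of widths that are a multiple of `size`."""
    return sum(1 for w in widths if w % size == 0) / len(widths)

lines = ["    x = 1"] * 25 + ["   "] * 3
before = share_divisible(indent_widths(lines, skip_blank=False), 4)
after = share_divisible(indent_widths(lines), 4)
```

Counting the whitespace-only lines gives 25 out of 28, just under 90%; skipping them gives a full match, and an 80% threshold would have accepted the file either way.
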

#8

…I’ll just stick it in the next beta, life is simpler that way.


#9

Well… next beta will be out pretty soon.

My default settings are to just insert tabs, which means when I edit a Python file indented with spaces, it gets nasty tabs added. This plugin makes it just work, which is pleasant.


#10

Next beta will be out tonight or tomorrow night.
