Sublime Forum

Formatting data

#1

I regularly have to work with XML files that arrive without line breaks, which makes tracking down errors much harder than it needs to be, e.g.:

<?xml version="1.0" encoding="utf-8"?> <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <soap:Body> <Quote xmlns="http://TEST/T"> <lstrRiskXML> <lstrRiskXML xmlns=""> <householdRisk xmlns="http://www.testsystem.com/schemas" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.testsystem.com/schemas TESTHouseholdSchema.xsd"> <insuredParty instance="1"> <client>true</client> <dob>1956-01-01T12:00:00.0000000-00:00</dob> <email>805a6ab24@897ebe196.com</email> <forename>test</forename> <maritalStatusID>M</maritalStatusID> <sex>Male</sex> <surname>Smith</surname> <titleID>003</titleID> </insuredParty> </lstrRiskXML> </lstrRiskXML> </Quote> </soap:Body> </soap:Envelope>

In UltraEdit there is an option ‘XML convert to CR/LFs’ which ‘pretty prints’ it, complete with indentation, and I was wondering if it’s possible to achieve the same thing in Sublime? If it is, then I’m one step closer to ditching UE entirely.

Another option UE has is ‘convert to fixed column’, which is a sort of column mode that ‘pretties’ or aligns data in comma-, space- or tab-delimited files, padding them out with spaces to make them easier to read:

one,two,three,four,five
six,seven,eight,nine,ten

becomes

one,two  ,three,four,five
six,seven,eight,nine,ten

Any ideas as to how I could do this?

Cheers,
Mick


#2

Hi Mick,

I use the following plugin for tidying XML. If there is a filename in the clipboard it will tidy that file, otherwise it will tidy the active view. It will only tidy a file if it contains well-formed xml (without this restriction I had a problem with clobbering non-xml files). You might want to play with the command line parameters passed to the tidy program it uses.

This works on Windows only; it requires the “tidy” application (from tidy.sourceforge.net). Ideally it would be changed to use one of the python XML-tidy libraries, but I wasn’t sure (1) how to include an external dependency and (2) which library to use.
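
On the pure-Python route: a minimal sketch using only the standard library's xml.dom.minidom, so there is no external dependency to bundle. The function names pretty_print_xml and _strip_whitespace_nodes are made up for illustration, and this is an assumption about what might be acceptable here, not what the plugin below does.

```python
from xml.dom import minidom


def _strip_whitespace_nodes(node):
    """Remove whitespace-only text nodes so toprettyxml() adds no blank lines."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            _strip_whitespace_nodes(child)


def pretty_print_xml(text, indent="  "):
    """Return `text` re-indented, one element per line.

    Raises xml.parsers.expat.ExpatError if `text` is not well-formed,
    so it doubles as a well-formedness guard.
    """
    dom = minidom.parseString(text)
    _strip_whitespace_nodes(dom)
    return dom.toprettyxml(indent=indent)


if __name__ == "__main__":
    print(pretty_print_xml("<a> <b>1</b> <c/> </a>"))
```

Note that minidom writes its own XML declaration at the top of the output, so the result is not a byte-for-byte reformat of the input.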

Anyway, hope it helps.

Cheers,

Josh

'''
@author Josh Bjornson

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/
or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Disclaimer:
This comes with no warranty.  Use at your own risk.

Installation:
Download and install tidy.sourceforge.net and update the path below accordingly (TIDY_EXE)

Add the appropriate keyboard shortcuts from "Preferences" \ "User Key Bindings":
{ "keys": ["alt+t"], "command": "xml_tidy" }
'''
import sublime, sublime_plugin
import os
from subprocess import Popen

from xml.sax.handler import ContentHandler
from xml.sax import make_parser

TIDY_EXE = 'C:/path/to/tidy.exe'

class XmlTidyCommand(sublime_plugin.TextCommand):

    def run(self, edit):
        """If there is a filename in the clipboard then tidy it, otherwise tidy the current view"""
        clip = sublime.get_clipboard()
        filepath = (clip if os.path.isfile(clip) else self.view.file_name())
        self.tidy_file(filepath)


    def tidy_file(self, filepath):
        """
        Gets the filename of the current view and tries to tidy it.
        The tidy will only be attempted for well-formed XML documents.
        For command-line options, see http://tidy.sourceforge.net/docs/tidy_man.html
        """
        if not os.path.isfile(filepath):
            self.alert('Unable to tidy the current file.  Please save the file first.')
        elif self.is_well_formed(filepath):
            self.alert('Tidying file "%s"' % (filepath))
            Popen('"%s" -q -xml -i -u -m -w 120 "%s"' % (TIDY_EXE, os.path.normpath(filepath)))
        else:
            self.alert('Aborting tidy - the file does not contain well-formed xml ("%s")' % (filepath))


    def is_well_formed(self, filepath):
        """Adopted from: http://code.activestate.com/recipes/52256-check-xml-well-formedness"""
        parser = make_parser()
        parser.setContentHandler(ContentHandler())
        well_formed = True
        try:
            parser.parse(filepath)
        except Exception:
            well_formed = False
        return well_formed


    def alert(self, message):
        """Display a status message in Sublime Text and also write to the console"""
        print('XmlTidy: ' + message)
        sublime.status_message(message)
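
The is_well_formed check above parses a file on disk, which is why unsaved buffers are rejected. A hedged sketch of the same guard applied to in-memory text, using xml.sax.parseString from the standard library; the name is_well_formed_text is hypothetical, and the sketch assumes the text can be encoded as UTF-8 and does not declare a conflicting encoding.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def is_well_formed_text(text):
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        # A bare ContentHandler ignores all events; only parse errors matter here.
        xml.sax.parseString(text.encode("utf-8"), ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed_text("<a><b/></a>"))   # balanced tags
    print(is_well_formed_text("<a><b></a>"))    # <b> is never closed
```

Inside a plugin, the text would presumably come from the view rather than from a file, but that wiring is left out of this sketch.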

#3

Thanks for that Josh, I’ll give it a try.

Cheers,
Mick


#4

…and here is a plugin for the ‘convert to fixed column’ functionality. The plugin will take each selection and apply the “fixed columns” formatting independently. To apply this to a whole file, just ‘select all’ then run the plugin. If you don’t like the results then an undo will bring you back to the previous state.

I’m new to python so there may be more efficient ways to do this, but it seems to work. It was a fun problem to solve and learn a bit more about python. Definitely a cool language.

Cheers,

Josh

'''
@author Josh Bjornson

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/
or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
'''
import sublime_plugin

SPLIT_CHAR = ','

# convert_to_fixed_column
class ConvertToFixedColumnCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        for region in self.view.sel():
            self.view.replace(edit, region, self.align_content(self.view.substr(region)))

    def align_content(self, content):
        # calculate the max width for each column
        lines = []
        widths = []
        for text in iter(content.splitlines()):
            line = text.split(SPLIT_CHAR)
            lines.append(line)
            for (cell_idx, cell_val) in enumerate(line):
                if cell_idx >= len(widths):
                    widths.append(0)
                widths[cell_idx] = max(widths[cell_idx], len(cell_val))

        # format each cell to the max width
        output = []
        for line in lines:
            for col_idx in range(len(line)):
                # left-justify: pad each cell with trailing spaces to the column width
                line[col_idx] = line[col_idx].ljust(widths[col_idx])
            output.append(SPLIT_CHAR.join(line))
        # make sure that the trailing newline is saved (if there was one)
        if content.endswith('\n'):
            output.append('')
        return '\n'.join(output)
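
The same alignment can be tried outside Sublime as a plain function. This is only a sketch under assumptions: the name align_columns and its delimiter parameter are invented for illustration (the plugin above hard-codes SPLIT_CHAR), and cells are left-justified to match the padded example in post #1.

```python
def align_columns(content, delimiter=","):
    """Pad every cell with trailing spaces to the widest cell in its column."""
    rows = [line.split(delimiter) for line in content.splitlines()]

    # First pass: find the maximum width of each column index.
    widths = {}
    for row in rows:
        for idx, cell in enumerate(row):
            widths[idx] = max(widths.get(idx, 0), len(cell))

    # Second pass: left-justify each cell and re-join with the delimiter.
    aligned = [
        delimiter.join(cell.ljust(widths[idx]) for idx, cell in enumerate(row))
        for row in rows
    ]
    result = "\n".join(aligned)
    # Keep the trailing newline if the input had one.
    return result + "\n" if content.endswith("\n") else result


if __name__ == "__main__":
    print(align_columns("one,two,three,four,five\nsix,seven,eight,nine,ten"))
    print(align_columns("a\tbb\nccc\td", delimiter="\t"))
```

Passing delimiter="\t" or delimiter=" " covers the tab- and space-delimited cases mentioned in the original question, as long as the delimiter does not also appear inside the cell values.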

#5

Many thanks for that, Josh, it’s much appreciated.

Cheers,
Mick
