I'm far from a Python 3 expert, but I'm focused on learning new skills with it, so any help would be appreciated. While working on a personal project that I want to put on GitHub later, I ran into a command that outputs the following Python dictionary:

{'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}

I then want to parse that to the following JSON format:

{
"data": [
    {
        "{#PORT}": 80,
        "{#STATE}": "OPEN",
        "{#ENDTIME}": "1648285195"
    },
    {
        "{#PORT}": 22,
        "{#STATE}": "Interface #2",
        "{#ENDTIME}": "1648285195"
    }
]
}  

What would be the most efficient way to parse it? I don't want the result to end up in a file; I'd prefer to keep it within my code. Keep in mind that there might be more ports than just 22 and 80: the dictionary might be a lot longer, but it follows the same format.

Thanks!

  • Do you mean json.dumps? 2 hours ago
  • A plain JSON dump wouldn't get me where I need to be, because I wouldn't end up with the right format. The big issue is that the port number (22, 80, or whatever) is a dictionary key rather than a value stored under its own key. 2 hours ago
  • 1
    Welcome to Stack Overflow! You seem to be asking for someone to write some code for you. Stack Overflow is a question and answer site, not a code-writing service. Please see here to learn how to write effective questions
    – azro
    1 hour ago

3 Answers

This function should return what you want (I suppose):

def parse_data(scan_result):
    # Flatten scan_result['scan'][ip][protocol][port] into one entry per port.
    # The parameter is not called `input`, so the built-in stays usable.
    data = []
    for protocols in scan_result['scan'].values():
        for ports in protocols.values():
            for port, info in ports.items():
                data.append({
                    "{#PORT}": port,
                    "{#STATE}": info['state'].upper(),
                    "{#ENDTIME}": info['endtime'],
                })
    return {'data': data}

The function returns (output):

{
   "data":[
      {
         "{#PORT}":80,
         "{#STATE}":"OPEN",
         "{#ENDTIME}":"1648285195"
      },
      {
         "{#PORT}":22,
         "{#STATE}":"OPEN",
         "{#ENDTIME}":"1648285195"
      }
   ]
}

I don't know where 'Interface #2' in the 'state' of port 22 came from (in your desired result).
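Note that the function above returns a Python dict, not JSON text. A minimal sketch of the last step, assuming you want the JSON as a string in memory rather than in a file: json.dumps from the standard library serializes the dict without touching disk. The `result` literal below is a hypothetical stand-in for the function's return value.

```python
import json

# Hypothetical stand-in for the dict returned by the function above.
result = {
    "data": [
        {"{#PORT}": 80, "{#STATE}": "OPEN", "{#ENDTIME}": "1648285195"},
        {"{#PORT}": 22, "{#STATE}": "OPEN", "{#ENDTIME}": "1648285195"},
    ]
}

# json.dumps returns a str; nothing is written to a file.
json_text = json.dumps(result, indent=4)
print(json_text)
```

Because the port ends up as a value here, it stays a number in the JSON. Dumping the original scan dict directly would instead turn the integer port keys into strings, since JSON object keys are always strings.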

  • 1
    remove the ; in the line data.append(port_data);
    – D.L
    1 hour ago
  • Very elegant solution! Thanks a lot! I did not know we could loop through a dictionary like this to simply append the data to the new format. 33 mins ago

A possible solution is the following:

log_data = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}

result = {"data": []}

# Walk host -> protocol -> port; the dict defined above is named log_data.
for ip, protocols in log_data['scan'].items():
    for protocol, ports in protocols.items():
        for port, port_data in ports.items():
            data = {"{#PORT}": port, "{#STATE}": port_data['state'], "{#ENDTIME}": port_data['endtime']}
            result["data"].append(data)

print(result)

Prints

{'data': [
    {'{#PORT}': 80, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'},
    {'{#PORT}': 22, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'}]}

You could do a recursive search for the 'tcp' key and go from there. Something like this:

mydict = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}},
          'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}


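def findkey(d, k):
    # Depth-first search: returns the first value stored under key k, or None.
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            # Compare against None so an empty (falsy) match is still returned.
            if (r := findkey(v, k)) is not None:
                return r
    return None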


rdict = {'data': []}
for k, v in findkey(mydict, 'tcp').items():
    rdict['data'].append(
        {'{#PORT}': k, '{#STATE}': v['state'].upper(), '{#ENDTIME}': v['endtime']})


print(rdict)

Output:

{'data': [{'{#PORT}': 80, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}, {'{#PORT}': 22, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}]}
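One caveat: a search that returns on the first hit only sees the 'tcp' section of the first host. If the scan covers several hosts, a generator that yields every match may fit better. This is only a sketch: `find_all` is a hypothetical helper name, and the two-host dict (including the second IP address) is made-up sample data in the same shape as the question's dictionary.

```python
def find_all(d, key):
    """Yield every value stored under `key` at any depth of nested dicts."""
    for k, v in d.items():
        if k == key:
            yield v
        elif isinstance(v, dict):
            yield from find_all(v, key)


# Made-up two-host sample, trimmed to the fields that are actually used.
sample = {
    'scan': {
        '192.168.0.254': {'tcp': {80: {'state': 'open', 'endtime': '1648285195'}}},
        '192.168.0.1': {'tcp': {22: {'state': 'open', 'endtime': '1648285195'}}},
    }
}

rdict = {'data': []}
for ports in find_all(sample, 'tcp'):
    for port, info in ports.items():
        rdict['data'].append(
            {'{#PORT}': port, '{#STATE}': info['state'].upper(), '{#ENDTIME}': info['endtime']})

print(rdict)
```

The generator does not descend into a value once its key matches, which is fine here because port entries do not contain a nested 'tcp' key.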
