# Image Classification & Object Detection
###### tags: `黃仲璿` `2021/07/15`

## Image Classification
Image classification classifies what the image is __as a whole__, so it outputs only __one__ recognition result per image.

### Classifying a Single Image
`$ ./imagenet.py --network=<netname> <input filename> <output filename>`

ex. `$ ./imagenet.py --network=resnet-18 images/jellyfish.jpg images/test/output_jellyfish.jpg`

Optional arguments:
```
--network <network name>
# choose which network model to use
# default is googlenet
--threshold <value>
# value is a float between 0 and 1
# minimum threshold for detection confidence; default is 0.5
```

### Live Video Recognition
`$ ./imagenet.py /dev/video0`

*Note: This doesn't work in remote desktop mode; it has to be run locally.*
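The same classification can also be done from a short Python script using the `jetson.inference` API instead of the bundled `imagenet.py`. This is a minimal sketch, assuming the jetson-inference Python bindings are installed and the `resnet-18` model has been downloaded; the image path reuses the jellyfish example above.

```python
import jetson.inference
import jetson.utils

# load a pre-trained classification network (resnet-18, as in the example above)
net = jetson.inference.imageNet("resnet-18")

# load the image from disk into GPU memory
img = jetson.utils.loadImage("images/jellyfish.jpg")

# classify the image; returns the index of the top class and its confidence (0~1)
class_idx, confidence = net.Classify(img)

# look up the human-readable name of that class
class_desc = net.GetClassDesc(class_idx)

print("recognized as '{:s}' (class #{:d}) with {:.2f}% confidence".format(class_desc, class_idx, confidence * 100))
```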
## Object Detection with DetectNet
Object detection will detect __multiple objects__ that the network model can identify in one image, thus showing __multiple__ recognition results per image.

### Detect Objects from Images
`$ ./detectnet.py <input filename> <output filename>`

ex. `$ ./detectnet.py --network=ssd-mobilenet-v2 images/peds_0.jpg images/test/output.jpg`

Optional arguments:
```
--network <network name>
# choose which network model to use
# default network is SSD-Mobilenet-v2
--overlay <flags separated by ','>
# select the way to present recognized objects
# default is --overlay=box,labels,conf
# box = box coloring; labels = object name; conf = confidence
--threshold <value>
# value is a float between 0 and 1
# minimum threshold for detection confidence; default is 0.5
```

#### Detectnet.py Source Code Notes
`argument`: the variables passed on the command line after `./detectnet.py`
*Note: Arguments prefixed with `--` are optional.*

`opt = parser.parse_known_args()[0]`: parses the recognized command-line arguments into `opt` (unrecognized ones are passed through to the library via `sys.argv`)

`output.Render(img)`: renders `img` (with the overlay already drawn onto it by `net.Detect`) to the output stream

```python=
import jetson.inference
import jetson.utils

import argparse
import sys

# parse the command line
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.")

# argument = variables to insert after "./detectnet.py"
parser.add_argument("input_URI", type=str, default="", nargs='?', help="URI of the input stream")
parser.add_argument("output_URI", type=str, default="", nargs='?', help="URI of the output stream")
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are: 'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold to use")

try:
    opt = parser.parse_known_args()[0]
except:
    print("")
    parser.print_help()
    sys.exit(0)

# load the object detection network
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# create video sources & outputs
input = jetson.utils.videoSource(opt.input_URI, argv=sys.argv)
output = jetson.utils.videoOutput(opt.output_URI, argv=sys.argv)

# process frames until the user exits
while True:
    # capture the next image
    img = input.Capture()

    # detect objects in the image (with overlay)
    detections = net.Detect(img, overlay=opt.overlay)

    # print the detections
    print("detected {:d} objects in image".format(len(detections)))

    for detection in detections:
        print(detection)

    # render the image (the overlay was already drawn onto img by net.Detect)
    output.Render(img)

    # update the title bar
    output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))

    # print out performance info
    net.PrintProfilerTimes()

    # exit on input/output EOS
    if not input.IsStreaming() or not output.IsStreaming():
        break
```
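The script above streams from a camera or video source; the same `detectNet` API can also be called on a single image, and each returned detection exposes its class ID, confidence, and bounding-box coordinates (which `print(detection)` dumps all at once). A minimal sketch, assuming the jetson-inference Python bindings and the default `ssd-mobilenet-v2` model are installed; the image paths reuse the pedestrian example above.

```python
import jetson.inference
import jetson.utils

# load the detection network with the default model and threshold
net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)

# load a single image from disk into GPU memory
img = jetson.utils.loadImage("images/peds_0.jpg")

# run detection; the overlay is drawn directly onto img
detections = net.Detect(img, overlay="box,labels,conf")

# each detection carries its class, confidence, and bounding box
for det in detections:
    print("{:s} ({:.1f}%)  left={:.0f} top={:.0f} right={:.0f} bottom={:.0f}".format(
        net.GetClassDesc(det.ClassID), det.Confidence * 100,
        det.Left, det.Top, det.Right, det.Bottom))

# save the annotated image
jetson.utils.saveImage("images/test/output.jpg", img)
```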