Q&A Session

1. Are the designs required to be open-source?

Yes. At our sponsors' request, and because the contest is expected to benefit the general community, we require all participants to make their designs open-source. This is mandatory if you wish to participate.

2. For the design, if we're using third party IPs that we cannot open-source as we do not own that IP, how does that affect things?

You may use third-party IP as desired. You only need to open-source your own design and reference the third-party IP that you use.

3. Are we supposed to travel to DAC to participate in the contest?

No, you do not need to travel to DAC to participate. However, if you are among the top three winners in each category, at least one member of the team is required to register for DAC, attend the award ceremony, and give a technical presentation.

4. Our team is from XX (country or region). Can we participate?

This contest is open to participants worldwide. However, please note that our sponsor may have restrictions on where they can ship the design kit. If you are from a country or region where they cannot ship, you will have to obtain your own design kit.

5. Can we have more than one team from the same organization?

Yes, you can have as many teams as you wish, but please note that we ship only one design kit per category to each organization. Multiple teams from the same organization will need to share the design kit.

6. Can we add team members after the registration?

Yes, you can email us to update your team's registration.

7. Can a person participate in two teams?

Yes, as long as they are not in the same category.

8. Can I register one team for both platforms?

Yes, but you will need to register separately. The team name can be the same or different.

9. Will the released training dataset cover all the situations in your hidden evaluation dataset?

Yes.

10. Will the sponsors take the design kit back after the competition?

This depends on an evaluation of your effort in the contest. At the end of the competition we will require each team to submit their complete source code, whether it works or not. If our sponsors deem that little effort was spent, you will be asked to return the design kit.

11. What are the exact classes that are expected to be detected?

The dataset provided contains all objects that need to be detected. The hidden dataset that we will use in evaluation will contain the same objects as the one provided. However, you will not be told which class an image belongs to. Your algorithm should detect it automatically.

12. Is this a single-object detection, i.e., each image will only contain a single object?

Yes.

13. Will you provide the input/output interface? How do we set the 20 fps throughput rate? Is the file read/write time counted?

Yes, we will provide a reference design very soon. You can change the detection algorithm used inside it but please keep the input/output interface the same. The file read/write time will not be counted when deciding the throughput rate. Meanwhile, you can start to write the core algorithm.
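
Until the reference design is available, a rough way to check throughput while excluding file read/write time is to time only the detection call. The sketch below is an illustration, not the official measurement: `measure_fps`, `load_image`, and `detect` are hypothetical names, and the reference design may measure throughput differently.

```python
import time


def measure_fps(load_image, detect, paths):
    """Return frames per second, counting only time spent inside detect().

    load_image and detect are hypothetical callables supplied by the team;
    they stand in for whatever I/O and detection code is actually used.
    """
    compute_seconds = 0.0
    for path in paths:
        image = load_image(path)            # file read: not timed
        start = time.perf_counter()
        detect(image)                       # core algorithm: timed
        compute_seconds += time.perf_counter() - start
    if compute_seconds == 0.0:
        return float("inf")
    return len(paths) / compute_seconds
```

A result of 20 or more from `measure_fps` would suggest the core algorithm meets the target, subject to how the official interface measures it.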

14. I'm not seeing any constraint on the latency. Is there any requirement on that?

There is no explicit requirement on latency. However, with a 20 fps throughput rate and limited on-chip memory, there is an implicit one.
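
The implicit bound can be made concrete with a little arithmetic. At 20 fps a new frame arrives every 50 ms, so a design whose latency exceeds 50 ms must hold several frames in flight at once, and each of those needs buffer space. The sketch below assumes a simple pipelined design; the function names are illustrative only.

```python
import math

TARGET_FPS = 20  # throughput target stated in the contest rules


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def frames_in_flight(latency_ms, fps):
    """Frames that must be buffered concurrently to sustain fps at this latency.

    Assumes a pipelined design that accepts a new frame every frame period.
    """
    return math.ceil(latency_ms / frame_budget_ms(fps))


print(frame_budget_ms(TARGET_FPS))        # 50.0
print(frames_in_flight(120, TARGET_FPS))  # 3
```

With limited on-chip memory, the number of frames that can be held at once caps the latency a design can tolerate.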

15. Is the 20 fps speed a hard requirement?

Yes, this is what makes the contest interesting (and hard). If your team fails to achieve this target, you will not be ranked.

16. I see different image resolutions, and some images are not properly labeled. What shall I do?

We have prepared a sanitized dataset; please download the new one.

17. Are we required to use a DNN (deep neural network) to approach this, or are other machine learning methods also allowed?

You are not required to use a DNN; other machine learning methods are also fine.

18. Are all the images 640x360? I have inspected a small sample which suggests this is true; can you confirm this is the only resolution we need to handle in the entire dataset, and in all evaluations?

Yes, all the images are 640x360.

19. There are several datasets for some categories, e.g., person1 through person29. Are we supposed to just name them as “person”, or do we have to be specific and say “person3”?

This is a detection contest only, not a recognition contest. There is no requirement for recognition.

20. The XML has labels in the training set as <name>person1</name> tags. These labels are not in the example output given on the web page. Should the label actually be “person” rather than person1 through person29?

See the answer above.
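
Since only detection is scored, a training script can read the bounding box from each annotation and ignore the specific label. The sketch below assumes a Pascal VOC-style layout (`object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`); only the `<name>` tag is confirmed by the question above, so check the element names against the actual files. `read_box` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def read_box(xml_text):
    """Return (xmin, ymin, xmax, ymax) for the single labeled object.

    Assumes a Pascal VOC-style annotation; the <name> label (e.g. person1)
    is deliberately ignored because recognition is not required.
    """
    root = ET.fromstring(xml_text)
    box = root.find("object").find("bndbox")
    return tuple(int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
```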

21. Datasets for person3 and person29, for example, clearly have 2 people in the first few images. Likewise, riding has images of multiple bicycles. We have been told to only find 1 object of 1 type in the image. How are we supposed to know which person to identify?

Although there are multiple objects of the same kind, they are distinct, and only one of them is the target object. That is the hard part of the contest.

22. Can we use temporal information from frame to frame? Are we guaranteed that each test will be a video stream of similar images from start to finish, where the same object type is to always be identified within that stream? E.g., can we do “recognition” once, and then just do “tracking” afterwards?

No, you are not supposed to use the temporal information. We guarantee that the hidden dataset will be from the same videos (so same objects), but the images will be provided in arbitrary order.
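
One way to make sure a detector does not lean on frame order is to shuffle the frames of each video during local validation. This is only a sketch of that idea; `arbitrary_order` is a hypothetical helper, and the actual evaluation order is chosen by the organizers.

```python
import random


def arbitrary_order(frame_paths, seed=None):
    """Return a shuffled copy of frame_paths, leaving the input untouched.

    A fixed seed makes the shuffle repeatable between validation runs.
    """
    shuffled = list(frame_paths)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```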

23. There are not a lot of variations of each object type, which makes the training difficult. For example, there is only one category for birds, bird1. This will make it difficult to find birds in an arbitrary video sequence. Do we only have to find/track birds in frames that belong to this same video sequence?

See the answer above. The objects will be the same as those in the training dataset provided.

24. The images in bird1 have navigational data overlaid on the image, making it more challenging. In a real drone, I would assume that we would not have this navigational data on the image feed. Do we have to handle such occlusions?

The overlays do not interfere with the bounding box, so they should not matter.

25. The building category looks impossible. How are we to know which building is to be identified? There are no obvious answers when there are hundreds of buildings in view. Will the user tag the first one of interest?

It is always the same building that needs to be identified, although it looks similar to others. This is again a “challenging” problem meant to distinguish the top teams.

26. Some cases are incredibly difficult, e.g., boat9/000559.jpg has a boat that is just 6 pixels wide x 8 pixels high. As a human, I cannot identify it directly, only indirectly through the wake on the water. Are we expected to use temporal information to help resolve what the object was in the past?

This case falls in the “challenging” category. You cannot use temporal information.
