# Email to Jonathan

Dear Jonathan,

Hope you are well. We are currently facing a few issues in our project, in particular:

- Issue with setting up our Virtual Machine to run the pose estimation program in real time.
- Issue with project direction.

### Virtual Machine Issue

As of now, we have not been able to run pose estimation in real time on the Virtual Machine (VM) provided by Lucas. This could be due to unsupported drivers or incompatible hardware. We are working on this with Lucas for the time being.

> [yick] I'm afraid that the existing architecture may not be able to support real time at all; this is all or nothing, no matter how close we seem to get.
> [matt] Let me work on PoseNet and let you know.
> [yick] Check Slack as well; I have written about sign language there.
> [yick] Anything else to add for this email?
> [matt] Let's wait for Tsz Kiu's input.
> [Tsz Kiu] Since I was not focusing on the real-time issue, I am just wondering why the existing architecture would not be able to support real time?
> [yick] Check this out: https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html — our VM is Windows Server 2019; if it is a cross-compiler-based OS, then CUDA is not supported at all, which means we could not run OpenPose on the GPU (i.e. no real time). Also check this: https://docs.nvidia.com/cuda/vGPU/index.html — the VM uses a virtual GPU (vGPU), which is not the same as a normal GPU, so drivers written for a normal GPU may not work with a vGPU. In our case, our vGPU is supported, but some functionality is excluded.
> [Tsz Kiu] It seems like CUDA supports the native compiler though? What is the difference between native and cross?
> [yick] Whatever that means, I don't know; I'm just reading out the specs and pointing them to Lucas. Matthew and Lucas are in a better position to comment on this.
> [yick] I don't know; that's why it was asked in Lucas's email — we're trying to give him as much information on CUDA as possible so that he can advise us on this.
> [Tsz Kiu] Well, it says it would work with a native compiler. How can you be sure that the existing architecture may not be able to support real time at all? Having said that, I think you are right about the excluded features. Maybe we can also look into whether they are critical to our application?
> [yick] Crap, I think we could have mentioned that explicitly in the email as well ...
> [Tsz Kiu] Do you want to send him an extra email? Or would you like to wait for his response first?
> [yick] Actually we did, just not in an explicit way.
> [yick] Yeah, we have to wait for his response regardless; it remains to answer: switch to a different architecture (platform), a different application, or both?
> [yick] Actually, I also made some comments below; check them out.
> [matt] I have emailed the details to Lucas to see if he has any input on the matter. The issue is more whether there is a workaround.
> [matt] Don't forget that we can still build and run OpenPose for CPU; it may not be as fast, but maybe it will be tolerable (we have to test again once we can link our webcams to the VM).
> [matt] Also, OpenPose on CPU is able to run hand recognition as well (as well as feet! :) )

### Project Direction Issue

Our current project application is about controlling musical parameters with pose estimation:

1. either developing a self-contained interactive environment for this purpose,
2. or building a plugin for existing audio synthesis software such as Pure Data and/or SuperCollider (a rough sketch of this approach follows below).
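To make option 2 concrete, the bridge between the pose estimator and the audio software could be as thin as a handful of OSC messages. The sketch below is purely illustrative, not something we have built: it assumes the `python-osc` package, a Pure Data or SuperCollider patch listening for OSC on port 9000, and placeholder keypoint values in place of real pose-estimation output.

```python
# Illustrative sketch only: assumes `pip install python-osc` and an OSC listener
# (e.g. a Pure Data or SuperCollider patch) on localhost:9000. Keypoint values
# below are placeholders for whatever the pose-estimation backend would return.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)  # hypothetical receiver address/port

def send_pose_frame(keypoints):
    """Map selected (normalised 0..1) keypoints to musical parameters."""
    # e.g. right-wrist height -> filter cutoff, left-wrist height -> volume
    client.send_message("/pose/right_wrist_y", float(keypoints["right_wrist_y"]))
    client.send_message("/pose/left_wrist_y", float(keypoints["left_wrist_y"]))

# One dummy frame standing in for real per-frame pose output.
send_pose_frame({"right_wrist_y": 0.42, "left_wrist_y": 0.77})
```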
There are two concerns:

- It is uncertain when, and whether, we will be able to set up and run real-time pose estimation.
- The real-time component is crucial for the project application above.

Given the above, we are considering another project application: "Australian Sign Language (AusLAN) Recognition", restricted to certain classes of signs. Most of its value would still be retained even if the real-time component cannot be achieved, since recognition accuracy is the core.

There are some concerns regarding changing our project direction as well:

- Since we have not researched much on sign language translation, there could be potential issues we are unaware of.

> [yick] Gesture recognition for music and AusLAN share the same backend technology (inference).
> [yick] On this part, we haven't really done much for either ...
> [Tsz Kiu] In terms of the technical stuff, not much about gesture recognition. But in terms of the output and what kind of gestures we want to recognize, I think we (or at least I) have a better picture for music than for sign language.

- Although not as important, it is still good to have a real-time component for the user experience.

> [yick] I suppose real time is a "could-have" instead of a "should-have" for AusLAN.
> [yick] This has been mentioned above ("value retained"); so is this necessary?
> [Tsz Kiu] Whether real time is a "should-have" or a "could-have" depends on the application. For example, for emergency sign language recognition, the application would be useless without real time.
> [yick] Yes, true; non-real time does diminish the value, but the core part is the recognition accuracy.

Currently, our team is divided on whether:

1. we should stick with the initial application, Musical Parameter Control, or
2. switch to another application (such as AusLAN translation), or any other application you could recommend.

Please advise. If necessary, we can set up a quick team meeting at whatever time works for you to further discuss the issues listed. Any input would be much appreciated. Please let us know if you have any questions.

Thank you.