Hi there - I would love to help you out. I've worked a lot on x86/x64, mainly on the project dynarmic (an ARM to x86 recompiler) and on marssel (a not-yet-published Pica200 shader recompiler). I have knowledge of the basic instruction set, SSE, AVX (except AVX-512, but it's kinda similar to AVX2), BMI, and AES, as well as internal microarchitecture mechanisms such as loop buffers, cache levels (micro-instruction cache -> L1 -> L2 -> L3), and controlling such transfers in a multiprocessor environment.