#P8141. [ICPC2020 WF] What’s Our Vector, Victor?
[ICPC2020 WF] What’s Our Vector, Victor?
题目背景
ICPC2020 WF N
题目描述
Vector embeddings are a common tool in machine learning systems. Complex real-world concepts (for instance, words in a dictionary) are mapped onto vectors of real numbers. If embeddings are trained properly, related concepts remain close together in the vector space, so questions like "are these two words synonyms?" can be reduced to straightforward mathematical tests.
You have been hired by Ye Olde Ice Cream Shoppe to create an embedding model for flavors, so that when they are out of an ice cream flavor they can recommend related flavors to customers. After training your embedding on six cloud datacenters for four months, you finally had the perfect flavor model! You were ready to revolutionize the ice cream industry on your street and, dare we say it, your neighborhood! Well, until you accidentally dripped some free ice cream on your keyboard and deleted half the data. The Shoppe cannot afford the 30 billion rubles needed to retrain the model, so you are in trouble.
Fortunately, you still have various training results lying around. For a given deleted vector, the training data tells you how close it was to some known flavor vectors. The closeness of vectors and is just the standard Euclidean distance metric (that is, the length of the vector ). Write a tool that reconstructs embeddings which are consistent with the training results.
输入格式
The first line contains two integers and , where () is the number of dimensions of the vectors, and () is the number of training results for a deleted embedding vector you want to reconstruct. Each of the next lines describes a training result using numbers and . In a training result, () are the coordinates of a known vector, and () is the Euclidean distance from that vector to the deleted one.
Your submission will be tested only with the sample inputs given below and inputs generated according to the following procedure. First, and are chosen. Then, input vectors and output vector, each with dimension , are chosen at random. The coordinates are independent and uniformly distributed in the interval . Next, the Euclidean distance from each input vector to the output vector is computed. Finally, the output vector is discarded. Calculations use double-precision floating-point numbers. Numbers obtained using this procedure appear in the input with digits after the decimal point.
输出格式
Output values, the coordinates of any vector that has the given distance to each training vector with an absolute or relative error of at most .
题目大意
【题目描述】
向量嵌入是机器学习系统中的一种常用工具。现实世界中的某些复杂的概念(例如词典中的单词)通过这种方法可以映射到一个实向量上。如果能让机器正确地训练向量嵌入,由于相关概念在向量空间中保持着紧密的联系,因此像“这两个单词是同义词吗?”这种问题可以简化为简单的数学测试了。
你最近被 Ye Olde 冰淇淋专柜受雇,专门负责为冰淇淋口味创建一个向量嵌入式模型,这样当一种口味的冰淇淋卖完以后,这个嵌入式模型就可以像顾客推荐其它味道相近的冰淇淋。在六个云数据中心里训练了四个月以后,你终于拥有了一个完美的模型,正准备好改变你所在街区的冰淇淋行业,乃至整个街区时,你不小心把一些免费的冰淇淋滴到了键盘上,导致一半的数据被删除。重新训练模型需要花费 300 亿卢布,而专柜显然承受不了这么高的价格,因此你陷入了巨大的麻烦之中。幸运的是,你可以通过剩下的数据来还原出被删除的向量。具体地,对于一个给定的被删除的向量,数据会告诉你它与某些已知的味道向量的接近程度。两个向量 的接近程度即为这两个向量之间的标准欧几里得距离(即向量 的长度)。
现在,给定向量所处的空间维度 和已知向量个数 ,并给定所有 个向量的坐标 和到同一个被删除的向量的标准欧几里得距离 ,请你编写一个程序,求出被删除的向量的坐标。
【输入格式】
第一行输入两个整数 ,分别表示向量空间的维度和向量个数。
随后 行,每行 个实数 ,其中前 个实数描述了第 个向量的坐标,第 个实数则表示第 个向量与被删除的向量之间的标准欧几里得距离。
你提交的代码只会在样例输入数据和按以下方式生成的输入数据上进行测试:
- 在区间 中选出两个整数作为 。
- 随机生成 个 维输入向量和一个 维输出向量。这 个坐标在区间 中独立且均匀地生成。
- 对于每个输入向量,计算其与输出向量之间的标准欧几里得距离。
- 忽略输出向量。
所有在上述过程中的计算使用双精度浮点数,用这种方式生成的所有实数均保留到小数点后 位。
【输出格式】
输出 个整数,描述被删除向量的坐标。你需要保证你输出的答案和标准答案之间的误差不会超过 。
【样例】
见『输入输出样例』部分。
【数据范围】
对于所有数据:
- 。
- 。
- 。
Translated by Eason_AC
2 3
0 0 2.5
3 0 2.5
1.5 0.5 2.5
1.5 -2
2 2
0 0 2
4 -4 6
1.414213562373 1.414213562373
4 3
0 1 2 3 2
1 2 -1 7 5
1 0.3 3.4 1.2 3.3
1 2 3 4