Keywords: Docker, Spring Boot, dmidecode (motherboard serial number ....), nvidia-smi (GPU ...), System.getProperty(key)
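The project needs to read server information (motherboard serial number, GPU, CPU, and so on) to verify that the device is authorized.
Deployment environment: the application is built with Spring Boot, deployed in Docker containers, and runs on a Linux server.
Reading hardware information: we can get the result we need by executing commands. On Windows this can be done with a .bat file;
on Linux it can be done with a script run through sh. On Linux, dmidecode and nvidia-smi help us obtain hardware information.
When the system is deployed with Docker, scripts run inside the container. How do we get the same result as running them directly on the server?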
Runtime.getRuntime().exec(new String[] {"echo", "1234"}) is equivalent to entering the following command on the server:
docker exec -it dockerContainerName echo "1234"
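The in-container execution described above can be sketched as follows. This is a minimal example, not the project's actual code: the class name CommandRunner and the method run are hypothetical, and it uses ProcessBuilder from the JDK rather than Runtime.exec so that stderr can be merged into stdout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: runs a command in the current environment.
// Inside a container, "current environment" means the container, not the host.
class CommandRunner {

    /** Runs the command and returns its trimmed output (stdout and stderr merged). */
    public static String run(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        try {
            Process process = builder.start();
            StringBuilder output = new StringBuilder();
            // Read all output first, then wait for the process to exit.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            process.waitFor();
            return output.toString().trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("echo", "1234"));
    }
}
```

Because the process is started by the JVM inside the container, it only sees binaries and files that exist in (or are bound into) the container.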
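When binding nvidia-smi, only /usr/bin/nvidia-smi was bound; the files the command depends on were not.
Use whereis to find the corresponding files and bind them as well.
Binding /usr/lib/x86_64-linux-gnu triggers this problem, although in some cases it does not lead to a failure.
Solution: read the information directly from system files instead of executing commands.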
/sys/class/dmi/id/board_serial # file holding the motherboard serial number
/proc/driver/nvidia/gpus # directory holding the GPU IDs
// RuntimeUtil, FileUtil and CollUtil are presumably Hutool utilities; IOUtils is presumably Apache Commons IO.

// Read the motherboard serial number by running cat.
public static String getBoardSN() {
    String cmd = "cat /sys/class/dmi/id/board_serial";
    Process p = RuntimeUtil.exec(cmd);
    return RuntimeUtil.getResult(p);
}

// Read the motherboard serial number directly from the system file.
public static String getSerialNumberBySysFile() {
    // try-with-resources closes the stream, which the earlier version leaked
    try (FileInputStream in = new FileInputStream("/sys/class/dmi/id/board_serial")) {
        List<String> result = IOUtils.readLines(in, "UTF-8");
        return result.size() > 0 ? result.get(0) : "error";
    } catch (IOException e) {
        log.error("get sn error", e);
        return "";
    }
}

// Collect the GPU UUIDs from the "information" files under /proc/driver/nvidia/gpus.
public static String getGpuIdsBySysFile() {
    try {
        Path rootPath = Paths.get("/proc/driver/nvidia/gpus");
        List<File> informationList = FileUtil.loopFiles(rootPath, 2,
                pathname -> !pathname.isHidden() && pathname.getName().contains("information"));
        String gpuMark = "GPU UUID:";
        StringBuilder gpuResultStr = new StringBuilder();
        if (CollUtil.isNotEmpty(informationList)) {
            for (File infoFile : informationList) {
                try (FileInputStream in = new FileInputStream(infoFile)) {
                    List<String> result = IOUtils.readLines(in, "UTF-8");
                    // The UUID is expected on the third line of the file.
                    if (result.size() > 2) {
                        String gpuLine = result.get(2);
                        int index = gpuLine.indexOf(gpuMark);
                        if (index >= 0) { // skip lines that lack the marker
                            gpuResultStr.append(gpuLine.substring(index + gpuMark.length()).trim()).append(",");
                        }
                    }
                }
            }
        }
        return gpuResultStr.toString();
    } catch (IOException e) {
        log.error("get gpu error", e);
        return "";
    }
}
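For comparison, the same file-based approach can be written with only the JDK (java.nio.file), without third-party utilities. This is a sketch under assumptions: the class name HardwareFiles and its method names are hypothetical, and instead of relying on the UUID being on the third line it searches every line for the "GPU UUID:" marker.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, JDK only.
class HardwareFiles {
    private static final String GPU_MARK = "GPU UUID:";

    /** First non-blank line of the file, trimmed; empty if the file cannot be read. */
    public static Optional<String> firstLine(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8).stream()
                    .map(String::trim)
                    .filter(s -> !s.isEmpty())
                    .findFirst();
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    /** Finds the line containing "GPU UUID:" and returns the text after the marker. */
    public static Optional<String> parseGpuUuid(List<String> lines) {
        for (String line : lines) {
            int index = line.indexOf(GPU_MARK);
            if (index >= 0) {
                return Optional.of(line.substring(index + GPU_MARK.length()).trim());
            }
        }
        return Optional.empty();
    }

    /** UUIDs from every file named "information" up to two levels below root. */
    public static List<String> gpuUuids(Path root) {
        List<String> uuids = new ArrayList<>();
        if (!Files.isDirectory(root)) {
            return uuids; // e.g. no NVIDIA driver visible in this environment
        }
        List<Path> infoFiles;
        try (Stream<Path> paths = Files.walk(root, 2)) {
            infoFiles = paths
                    .filter(p -> p.getFileName() != null
                            && p.getFileName().toString().equals("information"))
                    .collect(Collectors.toList());
        } catch (IOException | UncheckedIOException e) {
            return uuids;
        }
        for (Path file : infoFiles) {
            try {
                parseGpuUuid(Files.readAllLines(file, StandardCharsets.UTF_8))
                        .ifPresent(uuids::add);
            } catch (IOException e) {
                // Skip files that cannot be read.
            }
        }
        return uuids;
    }

    public static void main(String[] args) {
        System.out.println(firstLine(Paths.get("/sys/class/dmi/id/board_serial")).orElse(""));
        System.out.println(String.join(",", gpuUuids(Paths.get("/proc/driver/nvidia/gpus"))));
    }
}
```

Returning Optional and an empty list, rather than "error" or "", lets the caller distinguish "not available" from a real value.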
The whereis command shows where an application or file is located, for example: whereis dmidecode
services:
  my_test_services_api:
    image: my_remote_storage:6565/my_test_services_api:e99e0457
    healthcheck:
      test: if mountpoint -q /data ;then echo "mounted" ;else kill 1 ;fi
      interval: 120s
      timeout: 3s
      start_period: 40s
    labels:
      group: "my_test_services_api"
    environment:
      - TZ=Asia/Shanghai
    privileged: true # run the container with root-level (privileged) access
    ports:
      - 2080:8080 # port mapping
    networks:
      - local
    command: java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -Xms1024m -Xmx1024m -Xmn256m -Xss256k -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC -jar my_test_services_api.jar
    restart: always
    volumes:
      - type: bind
        source: ./my_test_services_api.conf
        target: /opt/my_test_services_api/config/application.yml
      - type: bind
        source: /data1/log/my_test_services_api
        target: /opt/my_test_services_api/logs
      - type: bind
        source: /data1/data
        target: /data
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
      - type: bind
        source: ./docker-java.properties
        target: /docker-java.properties
      # dmidecode binding
      - type: bind
        source: /usr/sbin/dmidecode
        target: /usr/sbin/dmidecode
      - type: bind
        source: /dev/mem
        target: /dev/mem
      # nvidia-smi binding
      - type: bind
        source: /usr/bin/nvidia-smi
        target: /usr/bin/nvidia-smi
      - type: bind
        source: /usr/lib/x86_64-linux-gnu
        target: /usr/lib/x86_64-linux-gnu
System.getProperty is used to determine the operating system type. This article only implements retrieval of motherboard and GPU information on Linux.
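The operating system check can be isolated in a small helper. The sketch below is illustrative only: OsDetector, OsType and detect are hypothetical names, and the matching relies on the usual values of the os.name property such as "Linux", "Windows 10" and "Mac OS X".

```java
import java.util.Locale;

// Hypothetical helper that classifies the value of the os.name system property.
class OsDetector {
    enum OsType { WINDOWS, MAC, LINUX, OTHER }

    /** Classifies an os.name value; null or unknown values map to OTHER. */
    public static OsType detect(String osName) {
        if (osName == null) {
            return OsType.OTHER;
        }
        String name = osName.toLowerCase(Locale.ROOT);
        if (name.startsWith("windows")) {
            return OsType.WINDOWS;
        }
        if (name.startsWith("mac")) {
            return OsType.MAC;
        }
        if (name.contains("linux")) {
            return OsType.LINUX;
        }
        return OsType.OTHER;
    }

    /** Classifies the operating system the JVM is currently running on. */
    public static OsType current() {
        return detect(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println(current());
    }
}
```

Matching Linux explicitly, instead of treating everything that is not Windows or macOS as Linux, avoids running Linux-only commands on other systems.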
import lombok.extern.slf4j.Slf4j;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

@Slf4j
public class MachineUtil {

    // Returns the motherboard serial number reported by dmidecode (Linux only).
    public static String getBoard_Series_No_linux() {
        String osName = System.getProperty("os.name");
        log.info("system os name :{}", osName);
        String result = "";
        if (!osName.startsWith("Mac OS") && !osName.startsWith("Windows")) {
            String cmd = "dmidecode";
            BufferedReader bufferedReader = null;
            Process p = null;
            try {
                p = Runtime.getRuntime().exec(new String[] {cmd, "-s", "baseboard-serial-number"});
                // Read the output before waiting, so a full pipe cannot block the child process.
                bufferedReader = new BufferedReader(new InputStreamReader(p.getInputStream()));
                String line;
                while ((line = bufferedReader.readLine()) != null) {
                    result = result + line.trim();
                    log.info("result :{}", result);
                }
                p.waitFor();
            } catch (IOException | InterruptedException e) {
                log.error("Failed to read motherboard information", e);
            } finally {
                closeIoStream(bufferedReader, p);
            }
        }
        return result.trim();
    }

    // Returns the GPU UUIDs reported by "nvidia-smi -L", comma separated (Linux only).
    public static String getGpuIds_linux() {
        String osName = System.getProperty("os.name");
        String result = "";
        if (!osName.startsWith("Mac OS") && !osName.startsWith("Windows")) {
            String cmd = "nvidia-smi";
            BufferedReader bufferedReader = null;
            Process p = null;
            try {
                p = Runtime.getRuntime().exec(new String[] {cmd, "-L"});
                bufferedReader = new BufferedReader(new InputStreamReader(p.getInputStream()));
                String line;
                while ((line = bufferedReader.readLine()) != null) {
                    int index = line.toLowerCase().indexOf("uuid");
                    if (index >= 0) {
                        // Take the GPU UUID and strip surrounding whitespace.
                        result = result + line.substring(index + "uuid".length() + 1).trim() + ",";
                    }
                }
                p.waitFor();
            } catch (IOException | InterruptedException e) {
                log.error("Failed to read GPU information", e);
            } finally {
                closeIoStream(bufferedReader, p);
            }
        }
        // Drop the closing parenthesis that follows each UUID in the nvidia-smi output.
        return result.replace(")", "").trim();
    }

    private static void closeIoStream(BufferedReader br, Process pro) {
        if (br != null) {
            try {
                br.close();
            } catch (IOException e) {
                // Nothing useful can be done if closing fails.
            }
        }
        if (pro != null) {
            pro.destroy();
        }
    }
}
The hardware categories that dmidecode can report on are: bios, system, baseboard, chassis, processor,
memory, cache, connector, slot
watch -n 2 nvidia-smi # refresh every 2 seconds
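GPU: GPU index; when there are several cards, numbering starts at 0
Fan: fan speed (0%-100%); N/A means the card has no fan
Name: GPU model
Temp: GPU temperature
Perf: performance state, from P0 (maximum performance) to P12 (minimum performance)
Persistence-M: persistence mode status; persistence mode uses more power, but new GPU applications start faster
Pwr:Usage/Cap: power draw; Usage is the current draw, Cap is the power limit
Bus-Id: PCI bus address of the GPU, in the form domain:bus:device.function
Disp.A: Display Active, whether a display is initialized on the GPU
Memory-Usage: GPU memory usage
Volatile GPU-Util: GPU utilization
Uncorr. ECC: ECC status, whether error checking and correction is enabled; 0/disabled, 1/enabled
Compute M: compute mode; 0/DEFAULT, 1/EXCLUSIVE_PROCESS, 2/PROHIBITED
Processes: for each process, the GPU memory it uses, its process ID, and the GPU it runs on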
root/master:~ # nvidia-smi -L
GPU 0: GeForce RTX 3090 (UUID: GPU-****32a0-****-****-****-5************)
GPU 1: GeForce RTX 3090 (UUID: GPU-****e20b-****-****-****-c36********)
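Output in the format shown above can be parsed with a regular expression instead of index arithmetic. The following is a minimal sketch: NvidiaSmiParser and parseUuids are hypothetical names, and the UUID values in main are made-up placeholders, not real device IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for lines printed by "nvidia-smi -L".
class NvidiaSmiParser {
    // Matches "(UUID: <value>)" and captures <value> without the parentheses.
    private static final Pattern UUID_PATTERN =
            Pattern.compile("\\(UUID:\\s*([^)\\s]+)\\s*\\)");

    /** Returns the UUID of every line that contains one, in input order. */
    public static List<String> parseUuids(List<String> lines) {
        List<String> uuids = new ArrayList<>();
        for (String line : lines) {
            Matcher matcher = UUID_PATTERN.matcher(line);
            if (matcher.find()) {
                uuids.add(matcher.group(1));
            }
        }
        return uuids;
    }

    public static void main(String[] args) {
        // Placeholder UUIDs for illustration only.
        List<String> sample = Arrays.asList(
                "GPU 0: GeForce RTX 3090 (UUID: GPU-00000000-0000-0000-0000-000000000000)",
                "GPU 1: GeForce RTX 3090 (UUID: GPU-11111111-1111-1111-1111-111111111111)");
        System.out.println(String.join(",", parseUuids(sample)));
    }
}
```

Capturing the value inside the parentheses removes the need to strip the trailing ")" afterwards.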