<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="es">
	<id>https://wiki.nlhpc.cl/index.php?action=history&amp;feed=atom&amp;title=VLLM_API_con_apptainer</id>
	<title>VLLM API con apptainer - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.nlhpc.cl/index.php?action=history&amp;feed=atom&amp;title=VLLM_API_con_apptainer"/>
	<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;action=history"/>
	<updated>2026-04-04T22:27:18Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.3</generator>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=946&amp;oldid=prev</id>
		<title>Administrador: /* Launch the vLLM service */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=946&amp;oldid=prev"/>
		<updated>2025-04-11T16:55:47Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Launch the vLLM service&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:55, 11 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l39&quot;&gt;Line 39:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 39:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml purge&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml purge&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  # ----------------Modules----------------------------&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  # ----------------Modules----------------------------&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml apptainer&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml apptainer&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;/1.3.6-zen4-i&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  # ----------------Command--------------------------&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  # ----------------Command--------------------------&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  apptainer exec --rocm --fakeroot --bind /home/ai_inference_db:/home/ai_inference_db --bind /home/$USER:/home/$USER&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  apptainer exec --rocm --fakeroot --bind /home/ai_inference_db:/home/ai_inference_db --bind /home/$USER:/home/$USER&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Administrador</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=945&amp;oldid=prev</id>
		<title>Administrador: /* Load the apptainer module */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=945&amp;oldid=prev"/>
		<updated>2025-04-11T16:55:34Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Load the apptainer module&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:55, 11 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l18&quot;&gt;Line 18:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;To use the vLLM image with AMD support, load the corresponding module:&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;To use the vLLM image with AMD support, load the corresponding module:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml apptainer&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ml apptainer&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;/1.3.6-zen4-i&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Launch the vLLM service ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Launch the vLLM service ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Administrador</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=896&amp;oldid=prev</id>
		<title>Eosorio: /* Other Links */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=896&amp;oldid=prev"/>
		<updated>2025-04-07T22:05:11Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Other Links&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 22:05, 7 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l183&quot;&gt;Line 183:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 183:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;VLLM API con módulos de software&lt;/del&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[ &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Diffusers &lt;/ins&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;VLLM &lt;/del&gt;API con &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;apptainer]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[ &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;vLLM &lt;/ins&gt;API con &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;módulos de software &lt;/ins&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[ Diffusers &lt;/del&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=890&amp;oldid=prev</id>
		<title>Eosorio: /* Other Links */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=890&amp;oldid=prev"/>
		<updated>2025-04-07T21:39:52Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Other Links&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 21:39, 7 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l182&quot;&gt;Line 182:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 182:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[VLLM API con módulos de software]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[VLLM API con apptainer]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[ Diffusers ]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[ Diffusers ]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[ vLLM API con módulos de software ]]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=888&amp;oldid=prev</id>
		<title>Eosorio: /* Other Links */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=888&amp;oldid=prev"/>
		<updated>2025-04-07T21:22:50Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Other Links&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 21:22, 7 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l183&quot;&gt;Line 183:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 183:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;vLLM API con módulos de software&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[ Diffusers ]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[ &lt;/ins&gt;vLLM API con módulos de software &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=883&amp;oldid=prev</id>
		<title>Eosorio: /* Other Links */</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=883&amp;oldid=prev"/>
		<updated>2025-04-07T21:19:00Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Other Links&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;es&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Previous revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 21:19, 7 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l180&quot;&gt;Line 180:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 180:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Other Links ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Other Links ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[OLLAMA API]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;vLLM API with software modules&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;vLLM API with software modules&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=882&amp;oldid=prev</id>
		<title>Eosorio at 21:18, 7 Apr 2025</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=882&amp;oldid=prev"/>
		<updated>2025-04-07T21:18:25Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;a href=&quot;https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;amp;diff=882&amp;amp;oldid=881&quot;&gt;Show changes&lt;/a&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
	<entry>
		<id>https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=881&amp;oldid=prev</id>
		<title>Eosorio: Page created with «Introduction  In the context of the NLHPC, two main tools are offered for deploying LLMs and running inference with them: Ollama and vLLM. The choice between them depends on the model format and the specific requirements of the deployment:      Ollama: Recommended for quantized models and for users who are not familiar with deploying LLMs, given how easy the tool is to use.      vLLM: Idea…»</title>
		<link rel="alternate" type="text/html" href="https://wiki.nlhpc.cl/index.php?title=VLLM_API_con_apptainer&amp;diff=881&amp;oldid=prev"/>
		<updated>2025-04-07T20:59:40Z</updated>

		<summary type="html">&lt;p&gt;Page created with «Introduction  In the context of the NLHPC, two main tools are offered for deploying LLMs and running inference with them: Ollama and vLLM. The choice between them depends on the model format and the specific requirements of the deployment:      Ollama: Recommended for quantized models and for users who are not familiar with deploying LLMs, given how easy the tool is to use.      vLLM: Idea…»&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Introduction&lt;br /&gt;
&lt;br /&gt;
In the context of the NLHPC, two main tools are offered for deploying LLMs and running inference with them: Ollama and vLLM. The choice between them depends on the model format and the specific requirements of the deployment:&lt;br /&gt;
&lt;br /&gt;
    Ollama: Recommended for quantized models and for users who are not familiar with deploying LLMs, given how easy the tool is to use.&lt;br /&gt;
&lt;br /&gt;
    vLLM: Ideal for running models downloaded from Hugging Face in .safetensors format, offering high performance and efficiency. Recommended for users with experience deploying LLMs, given the large number of tunable parameters the tool exposes.&lt;br /&gt;
&lt;br /&gt;
vLLM is an inference and serving engine for LLMs, designed for high performance and memory efficiency. Its optimized architecture runs large-scale language models with GPU acceleration to maximize inference responsiveness. However, it is harder to use, because many parameters must be set for it to run correctly and efficiently.&lt;br /&gt;
&lt;br /&gt;
This article shows how to:&lt;br /&gt;
&lt;br /&gt;
    Deploy the vLLM service on the cluster using apptainer.&lt;br /&gt;
&lt;br /&gt;
    Connect to the API of the deployed service from your local computer.&lt;br /&gt;
&lt;br /&gt;
    Use the API to run inference workloads.&lt;br /&gt;
&lt;br /&gt;
Load the apptainer module&lt;br /&gt;
&lt;br /&gt;
To use the vLLM image with AMD support, load the corresponding module:&lt;br /&gt;
ml apptainer&lt;br /&gt;
Launch the vLLM service&lt;br /&gt;
&lt;br /&gt;
Example SBATCH script to start the service:&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#---------------Script SBATCH - NLHPC ----------------&lt;br /&gt;
#SBATCH -J vllm_serve&lt;br /&gt;
#SBATCH -p mi210&lt;br /&gt;
#SBATCH --gres=gpu:1&lt;br /&gt;
#SBATCH -n 1&lt;br /&gt;
#SBATCH --ntasks-per-node=1&lt;br /&gt;
#SBATCH -c 8&lt;br /&gt;
#SBATCH -t 0:10:00&lt;br /&gt;
#SBATCH --mem=4090MB&lt;br /&gt;
#SBATCH -o logs/vllm_serve_%j.out&lt;br /&gt;
#SBATCH -e logs/vllm_serve_%j.err&lt;br /&gt;
#-----------------Toolchain--------------------------&lt;br /&gt;
ml purge&lt;br /&gt;
# ----------------Modules----------------------------&lt;br /&gt;
ml apptainer&lt;br /&gt;
# ----------------Command--------------------------&lt;br /&gt;
apptainer exec --rocm --fakeroot \&lt;br /&gt;
--bind /home/ai_inference_db:/home/ai_inference_db \&lt;br /&gt;
--bind /home/$USER:/home/$USER \&lt;br /&gt;
$IMAGES/rocm6.3.2_mi210_ubuntu22.04_py3.12_vllm_0.7.1.dev103_ib.sif /bin/bash -c &amp;quot;&lt;br /&gt;
export HF_HOME=/home/ai_inference_db/models/&lt;br /&gt;
export HF_DATASETS_CACHE=/home/ai_inference_db/data/&lt;br /&gt;
vllm serve gpt2 --served-model-name gpt2 --distributed-executor-backend mp \&lt;br /&gt;
--tensor-parallel-size 1 --host 0.0.0.0&lt;br /&gt;
&amp;quot;&lt;br /&gt;
⚠️ Important considerations&lt;br /&gt;
&lt;br /&gt;
    Stopping the vLLM service&lt;br /&gt;
&lt;br /&gt;
The vLLM service does not terminate on its own. Always cancel the job with scancel so that compute resources are not left reserved and idle.&lt;br /&gt;
&lt;br /&gt;
    RAM&lt;br /&gt;
&lt;br /&gt;
The amount of RAM reserved must match the amount of memory needed by the model being launched. The following formula can be used to estimate the model size in GB:&lt;br /&gt;
&lt;br /&gt;
    &amp;lt;number of parameters&amp;gt; * &amp;lt;bits per parameter&amp;gt; / (8 * 10^9)&lt;br /&gt;
&lt;br /&gt;
For example, if your model has 14 billion parameters and is quantized to 4 bits, it needs 14 * 10^9 * 4 / (8 * 10^9) = 7 GB of memory to launch, so you must reserve at least 7 GB of memory.&lt;br /&gt;
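&lt;br /&gt;
The formula above can be sketched in Python as follows. The function name estimate_model_gb is hypothetical, and the estimate covers the model weights only.&lt;br /&gt;
```python
def estimate_model_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight size in GB: parameters * bits / (8 * 10^9)."""
    return num_params * bits_per_param / (8 * 10**9)


# 14 billion parameters quantized to 4 bits, as in the example above
print(estimate_model_gb(14e9, 4))   # 7.0
# the same parameter count at 16 bits per parameter
print(estimate_model_gb(14e9, 16))  # 28.0
```
The result can be used as a lower bound when choosing the value passed to --mem.&lt;br /&gt;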
&lt;br /&gt;
    Reserving GPUs&lt;br /&gt;
&lt;br /&gt;
Each MI210 GPU has 64 GB of VRAM. If your model needs more memory, request additional GPUs with --gres.&lt;br /&gt;
&lt;br /&gt;
You must also add the flag --tensor-parallel-size &amp;lt;NUM_GPUS&amp;gt; to vllm serve.&lt;br /&gt;
&lt;br /&gt;
Note that requesting more GPUs does not improve throughput in tokens/s. In addition, requesting 8 CPUs per requested GPU is recommended.&lt;br /&gt;
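&lt;br /&gt;
As a rough sketch of the sizing rule above, the following Python helper picks the smallest power-of-two GPU count whose combined VRAM holds the model weights, and derives the matching CPU count and --tensor-parallel-size value. The helper name plan_resources is hypothetical, and the estimate assumes that only the weights need to fit, ignoring any other GPU memory the server uses.&lt;br /&gt;
```python
MI210_VRAM_GB = 64   # VRAM per MI210 GPU, as stated above
CPUS_PER_GPU = 8     # CPUs recommended per requested GPU, as stated above


def plan_resources(model_gb: float) -> dict:
    """Smallest power-of-two GPU count whose total VRAM holds the weights."""
    gpus = 1
    while model_gb > gpus * MI210_VRAM_GB:
        gpus *= 2
    return {
        "gres": f"gpu:{gpus}",          # value for #SBATCH --gres
        "cpus": gpus * CPUS_PER_GPU,    # value for #SBATCH -c
        "tensor_parallel_size": gpus,   # value for --tensor-parallel-size
    }


print(plan_resources(7))
print(plan_resources(140))
```
A 7 GB model fits on a single GPU, while a 140 GB model is rounded up to 4 GPUs because 2 GPUs provide only 128 GB.&lt;br /&gt;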
&lt;br /&gt;
    Execution flags:&lt;br /&gt;
&lt;br /&gt;
vLLM offers a wide range of flags for launching its service. See the official guide: OpenAI-Compatible Server — vLLM&lt;br /&gt;
&lt;br /&gt;
    Listening port&lt;br /&gt;
&lt;br /&gt;
The default port is 8000. Using a different one (e.g. 11434) is recommended, to avoid a port that is already in use.&lt;br /&gt;
&lt;br /&gt;
Specify it by adding:&lt;br /&gt;
--port &amp;lt;Num_puerto&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to the vllm serve command.&lt;br /&gt;
&lt;br /&gt;
    Model download directory&lt;br /&gt;
&lt;br /&gt;
Models are downloaded automatically from the Hugging Face Hub into the directory /home/ai_inference_db/models/. To download and use models in your own user folder instead, change the environment variables:&lt;br /&gt;
unset HF_HOME&lt;br /&gt;
unset HF_DATASETS_CACHE&lt;br /&gt;
&lt;br /&gt;
In addition, if your model requires permission to be used, you can log in to your account with:&lt;br /&gt;
huggingface-cli login&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can contact NLHPC support through a ticket to request the &amp;lt;practica-gpu&amp;gt; group and download models to the shared folder.&lt;br /&gt;
&lt;br /&gt;
    Using your own models&lt;br /&gt;
&lt;br /&gt;
To use your own or a fine-tuned model, specify the path to your model file by adding the flag:&lt;br /&gt;
--model /ruta/a/tu/modelo&lt;br /&gt;
&lt;br /&gt;
to the vllm serve command.&lt;br /&gt;
&lt;br /&gt;
Note that if you use .gguf, the model must be in a single file and not split into parts.&lt;br /&gt;
Identify the execution node&lt;br /&gt;
&lt;br /&gt;
Use squeue to find the assigned node:&lt;br /&gt;
[usuario@leftraru2 ~]$ squeue&lt;br /&gt;
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)&lt;br /&gt;
38922125 mi210 vllm_serve usuario R 0:03 1 gn004&lt;br /&gt;
&lt;br /&gt;
Make a note of the node name, since it is needed in a later step.&lt;br /&gt;
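&lt;br /&gt;
If the node name needs to be picked out by a script, a minimal Python sketch could look like the following. It assumes the eight-column layout shown in the sample output above, and the function name find_node is hypothetical. Depending on how squeue is configured, the NAME column may be truncated, in which case the job name passed in must match the truncated form.&lt;br /&gt;
```python
def find_node(squeue_output: str, job_name: str):
    """Return the NODELIST entry of the first running job with this name."""
    for line in squeue_output.splitlines()[1:]:   # skip the header row
        fields = line.split()
        # columns: JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
        if len(fields) >= 8 and fields[2] == job_name and fields[4] == "R":
            return fields[7]
    return None   # no running job with that name


sample = """JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
38922125 mi210 vllm_serve usuario R 0:03 1 gn004"""

print(find_node(sample, "vllm_serve"))  # gn004
```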
Create the access tunnel&lt;br /&gt;
&lt;br /&gt;
Knowing the node where the job is running, create an access tunnel from the cluster to your computer. To do so, run the following in a new terminal, replacing the values:&lt;br /&gt;
ssh usuario@leftraru.nlhpc.cl -p 4603 -L [puerto_local]:[nodo]:[puerto_vllm]&lt;br /&gt;
&lt;br /&gt;
for example:&lt;br /&gt;
ssh intern02@leftraru.nlhpc.cl -p 4603 -L 4466:gn004:8000&lt;br /&gt;
&lt;br /&gt;
With this in place, [puerto_local] can be used to send queries to the vLLM API.&lt;br /&gt;
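&lt;br /&gt;
The tunnel command above can also be assembled from its parts, which helps when the node changes between jobs. This is only a sketch: the function name tunnel_command is hypothetical, and the host, SSH port and default vLLM port are the values given in this article.&lt;br /&gt;
```python
def tunnel_command(user: str, node: str, local_port: int, vllm_port: int = 8000) -> list:
    """Build the ssh argument list that forwards local_port to node:vllm_port."""
    return [
        "ssh", f"{user}@leftraru.nlhpc.cl",
        "-p", "4603",
        "-L", f"{local_port}:{node}:{vllm_port}",
    ]


print(" ".join(tunnel_command("intern02", "gn004", 4466)))
# ssh intern02@leftraru.nlhpc.cl -p 4603 -L 4466:gn004:8000
```
The list form can be passed directly to subprocess.run if the tunnel is to be opened from Python.&lt;br /&gt;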
Usage example&lt;br /&gt;
&lt;br /&gt;
Once the access tunnel is up, the vLLM API can be used from your personal computer with tools such as Python or curl.&lt;br /&gt;
&lt;br /&gt;
Here is a usage guide for the vLLM API.&lt;br /&gt;
&lt;br /&gt;
Some examples follow:&lt;br /&gt;
Python:&lt;br /&gt;
import requests&lt;br /&gt;
&lt;br /&gt;
# Replace [puerto_local], [nombre_modelo] and [tu_prompt] with your own values&lt;br /&gt;
url = &amp;quot;http://localhost:[puerto_local]/v1/completions&amp;quot;&lt;br /&gt;
headers = {&amp;quot;Content-Type&amp;quot;: &amp;quot;application/json&amp;quot;}&lt;br /&gt;
payload = {&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;[nombre_modelo]&amp;quot;,&lt;br /&gt;
    &amp;quot;prompt&amp;quot;: &amp;quot;[tu_prompt]&amp;quot;,&lt;br /&gt;
    &amp;quot;max_tokens&amp;quot;: 100,&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.7&lt;br /&gt;
}&lt;br /&gt;
# Send the completion request through the tunnel and stop on HTTP errors&lt;br /&gt;
response = requests.post(url, json=payload, headers=headers)&lt;br /&gt;
response.raise_for_status()&lt;br /&gt;
data = response.json()&lt;br /&gt;
# Print the generated text of the first choice&lt;br /&gt;
print(data[&amp;quot;choices&amp;quot;][0][&amp;quot;text&amp;quot;])&lt;br /&gt;
curl:&lt;br /&gt;
curl -X POST &amp;quot;http://localhost:[puerto_local]/v1/completions&amp;quot; \&lt;br /&gt;
-H &amp;quot;Content-Type: application/json&amp;quot; \&lt;br /&gt;
-d &amp;#039;{&lt;br /&gt;
&amp;quot;model&amp;quot;: &amp;quot;[nombre_modelo]&amp;quot;,&lt;br /&gt;
&amp;quot;prompt&amp;quot;: &amp;quot;[tu_prompt]&amp;quot;,&lt;br /&gt;
&amp;quot;max_tokens&amp;quot;: 100,&lt;br /&gt;
&amp;quot;temperature&amp;quot;: 0.7&lt;br /&gt;
}&amp;#039;&lt;br /&gt;
Troubleshooting&lt;br /&gt;
Why does the model fail to load?&lt;br /&gt;
&lt;br /&gt;
There can be several reasons:&lt;br /&gt;
&lt;br /&gt;
    The number of GPUs requested does not divide the number of attention heads of the model. A power-of-two number of GPUs is recommended.&lt;br /&gt;
&lt;br /&gt;
    The model is too large to fit in the requested GPUs. Check the model documentation on Hugging Face and estimate its size, then increase the number of GPUs requested.&lt;br /&gt;
&lt;br /&gt;
    The model has not been downloaded to the shared folder /home/ai_inference_db/models. Change the folder with unset HF_HOME and unset HF_DATASETS_CACHE, or ask NLHPC support for the &amp;lt;practica-gpu&amp;gt; group to download models to the shared folder.&lt;br /&gt;
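&lt;br /&gt;
The first cause in the list above can be checked before submitting a job. The sketch below lists the GPU counts that divide a given number of attention heads. The function name valid_gpu_counts, the limit of 8 GPUs and the head counts 12 and 40 are hypothetical values chosen for illustration; the real head count comes from the model configuration.&lt;br /&gt;
```python
def valid_gpu_counts(num_attention_heads: int, max_gpus: int = 8) -> list:
    """GPU counts up to max_gpus that evenly divide the attention heads."""
    return [n for n in range(1, max_gpus + 1) if num_attention_heads % n == 0]


print(valid_gpu_counts(12))  # [1, 2, 3, 4, 6]
print(valid_gpu_counts(40))  # [1, 2, 4, 5, 8]
```
A model with 12 heads cannot be split across 8 GPUs, even though 8 is a power of two, so the GPU count should satisfy both conditions.&lt;br /&gt;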
&lt;br /&gt;
Can I use quantized models in vLLM?&lt;br /&gt;
&lt;br /&gt;
Yes. With the flag --model /ruta/a/tu/modelo, an LLM in gguf (quantized) format can be loaded. However, it must be a single file.&lt;br /&gt;
&lt;br /&gt;
This is not recommended, since vLLM is optimized for inference at full or half precision.&lt;br /&gt;
&lt;br /&gt;
== Other Links ==&lt;br /&gt;
[[OLLAMA API]]&lt;br /&gt;
vLLM API with software modules&lt;/div&gt;</summary>
		<author><name>Eosorio</name></author>
	</entry>
</feed>