UNIVERSIDAD DE CHILE
FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS
DEPARTAMENTO DE INGENIERÍA MATEMÁTICA
APPROXIMATION ALGORITHMS FOR ORDER SCHEDULING
PROBLEMS ON PARALLEL MACHINES
THESIS SUBMITTED FOR THE DEGREE OF
MATHEMATICAL ENGINEER (INGENIERO CIVIL MATEMÁTICO)
JOSÉ CLAUDIO VERSCHAE TANNENBAUM
ADVISOR:
JOSÉ RAFAEL CORREA HAEUSSLER
COMMITTEE MEMBERS:
MARCOS ABRAHAM KIWI KRAUSKOPF
ROBERTO MARIO COMINETTI COTTI-COMETTI
SANTIAGO DE CHILE
AUGUST 2008
THESIS ABSTRACT
FOR THE DEGREE OF
MATHEMATICAL ENGINEER (INGENIERO CIVIL MATEMÁTICO)
BY: JOSÉ C. VERSCHAE T.
DATE: 06/08/2008
ADVISOR: JOSÉ R. CORREA
"APPROXIMATION ALGORITHMS FOR ORDER SCHEDULING PROBLEMS
ON PARALLEL MACHINES"
The goal of this thesis was to study order scheduling problems on machines. In this problem a producer has a certain number of machines on which a set of jobs must be processed. Each job belongs to an order, corresponding to the request of some client. Moreover, each job has an associated processing time, which may depend on the machine on which it is processed, and a release date before which the job cannot be scheduled. Finally, each order is given a weight reflecting how important the order is to the producer. The completion time of an order is the moment at which all of its jobs have been processed. The producer's problem is to decide when and on which machine each job is processed so as to minimize the weighted sum of order completion times.
This model generalizes several classic scheduling problems. On the one hand, the objective function of our model includes as special cases minimizing the total processing time (makespan) and the weighted sum of job completion times. On the other hand, in this thesis we will see that our model also generalizes the problem of minimizing the weighted sum of job completion times on a single machine subject to precedence constraints.
Since these problems are NP-hard, their apparent intractability suggests looking for efficient algorithms that return a solution whose cost is close to the optimal value. With this goal, and based on time-indexed linear relaxations, we propose a 27/2-approximation algorithm for the most general version of the problem described above. This is the first algorithm with a constant approximation guarantee for this problem, improving on the result of Leung, Li and Pinedo (2007). Based on similar techniques, for the case in which jobs may be preempted we also obtain an algorithm with an approximation guarantee arbitrarily close to 4.
In addition, we give a polynomial time approximation scheme (PTAS) for the case in which the orders are disjoint and the number of identical machines is constant. Moreover, we show that a variant of this approximation scheme applies when the number of machines is part of the input but the number of jobs per order, or the number of orders, is constant.
Finally, we study the problem of minimizing the makespan on unrelated machines. We propose an algorithm that transforms a solution with preemptable jobs into one in which no job is preempted, increasing the makespan by a factor of at most 4. Moreover, we show that no algorithm can do the same while increasing the makespan by a smaller factor.
Acknowledgments
First of all, I want to thank my parents and brothers for instilling in me the love of thinking. Their constant support helped me throughout my entire career. I thank my brother Rodrigo for always listening to me and discussing my writing.
To my loving wife Natalia, who with her help, love, patience and unconditional support helped me finish this thesis.
I especially thank my advisor José R. Correa, who through long hours of discussion introduced me to the world of research. Beyond helping me with and guiding my work, he gave me friendship, understanding and support in general. Without his constant support this thesis could not have been carried out successfully.
To all the students of the mathematics department of the University of Chile, for always being willing to talk and cheer me up.
I also thank Martin Skutella, who hosted me during my stay in Germany in September and October 2007. His collaboration and important contributions made Chapter 5 of this thesis possible. I also thank his whole group at TU-Berlin for making my stay in Berlin very pleasant. I also thank Nicole Megow for offering me her friendship and support.
Contents
1 Summary
  1.1 Machine scheduling problems
  1.2 Approximation algorithms
  1.3 Polynomial time approximation schemes
  1.4 Problem definition
  1.5 Previous work
    1.5.1 Single machine
    1.5.2 Parallel machines
    1.5.3 Unrelated machines
  1.6 Contributions of this work
    1.6.1 Chapter 3: The power of preemption for R||C_max
    1.6.2 Chapter 4: Approximation algorithms for minimizing ∑ w_L C_L on unrelated machines
    1.6.3 Chapter 5: A PTAS for minimizing ∑ w_L C_L on parallel machines
  1.7 Conclusions
2 Introduction
  2.1 Machine scheduling problems
  2.2 Approximation algorithms
  2.3 Polynomial time approximation schemes
  2.4 Problem definition
  2.5 Previous work
    2.5.1 Single machine
    2.5.2 Parallel machines
    2.5.3 Unrelated machines
  2.6 Contributions of this work
3 On the power of preemption on R||C_max
  3.1 R|pmtn|C_max is polynomially solvable
  3.2 A new rounding technique for R||C_max
  3.3 Power of preemption of R||C_max
    3.3.1 Base case
    3.3.2 Iterative procedure
4 Approximation algorithms for minimizing ∑ w_L C_L on unrelated machines
  4.1 A (4 + ε)-approximation algorithm for R|r_ij, pmtn|∑ w_L C_L
  4.2 A constant factor approximation for R|r_ij|∑ w_L C_L
5 A PTAS for minimizing ∑ w_L C_L on parallel machines
  5.1 Algorithm overview
  5.2 Localization
  5.3 Polynomial Representation of Order's Subsets
  5.4 Polynomial Representation of Frontiers
  5.5 A PTAS for a specific block
  5.6 Variations
6 Concluding remarks and open problems
Chapter 1
Summary
1.1 Machine scheduling problems
Machine scheduling problems deal with the allocation of scarce resources over time. They arise in many different settings: for example, a construction site where the foreman must assign tasks to each worker, a CPU that must process jobs requested by several users, or the production lines of a factory that must process products for its clients.
In general, an instance of a machine scheduling problem consists of a set of n jobs J and a set of m machines M. A solution of the problem, i.e. a schedule, is an assignment specifying on which machine i ∈ M and at what time each job is processed.
To classify scheduling problems we must look at the different characteristics or attributes of the machines and the jobs, together with the objective function to be minimized. One of these is the "machine environment", which describes the configuration of the machines in our model. For example, we may consider "identical" or "parallel" machines, where every machine is an identical copy of all the others. In this case each job j ∈ J takes p_j units of time to be processed, independently of the machine to which it is assigned. On the other hand, we may consider a more general situation where each machine i ∈ M has an associated speed s_i, so that the time a job j takes to be processed is inversely proportional to the speed of the machine.
Additionally, scheduling problems can be classified according to the characteristics of the jobs. For example, our model could consider "non-preemptable" jobs, i.e. jobs that, once started, cannot be interrupted until they have been completely processed. Alternatively, we could consider "preemptable" jobs, where we are free to interrupt a job that has started being processed and later resume its processing on the same machine or on a different one.
Finally, we can classify problems according to their objective function. One of the most natural objectives is to minimize the makespan, defined as the moment at which the last job finishes being processed. More precisely, if for a given schedule we define the "completion time" of a job j ∈ J, denoted C_j, as the moment at which j finishes being processed, then minimizing the makespan corresponds to minimizing C_max := max_{j∈J} C_j. Another classic example of an objective function is minimizing the number of late jobs. In this case, each job j ∈ J has an associated "due date" d_j, and the goal is to minimize the number of jobs that finish being processed after their due date. Besides these, a large number of different objective functions can be considered.
A large number of scheduling problems can be formed by combining the characteristics just mentioned. It is therefore convenient to introduce a standard notation describing each of these problems. Graham, Lawler, Lenstra and Rinnooy Kan [20] proposed the "three-field notation", where a scheduling problem is represented by an expression of the form α|β|γ, where α denotes the machine environment, β contains extra constraints or characteristics of the problem, and the last field γ denotes the objective function. In what follows we describe some common values taken by each field α, β and γ.
1. Values of α.
• α = 1: Single machine. We have a single machine on which to process the jobs. Each job j ∈ J takes a given time p_j to be processed.
• α = P: Parallel machines. We have a number m of identical or parallel machines on which to process the jobs. Thus the processing time of a job j is given by p_j, which does not depend on the machine to which j is assigned.
• α = Q: Related machines. In this case each machine i ∈ M has an associated speed s_i. The processing time of job j ∈ J on machine i is then given by p_j / s_i, where p_j is the time job j takes on a machine of speed 1.
• α = R: Unrelated machines. In this most general case there is no a priori relation between the processing times of a job on different machines, so the time it takes to process job j on machine i is given by an arbitrary number p_ij.
Additionally, in the case α = P, Q or R, we can append the letter m to the field, indicating that the number of available machines is constant. For example, if the parallel machine model considers a fixed number m of machines we denote it by α = Pm. The value of m can also be specified, e.g., α = P2 means that we have two parallel machines on which to process the jobs.
2. Values of β.
• β = pmtn: Preemptable jobs. In this case we consider jobs that may be interrupted (once or several times) before being finished, and must later be completed on the same machine or on a different one.
• β = r_j: Release dates. Each job has an associated time r_j before which it cannot start being processed.
• β = prec: Precedence constraints. We consider a partial order relation on the jobs (J, ≺). If for some pair of jobs j and k we have j ≺ k, then k must start being processed after the completion time of job j.
3. Values of γ.
• γ = C_max: Makespan. The objective is to minimize the makespan, given by C_max := max_{j∈J} C_j, where C_j is the moment at which job j finishes being processed, i.e. its completion time.
• γ = ∑ C_j: Average completion time. We must minimize the average completion time or, equivalently, the sum of completion times ∑_{j∈J} C_j.
• γ = ∑ w_j C_j: Weighted sum of completion times. We consider a weight w_j associated with each job j ∈ J, describing how important that job is. The objective is then to minimize the weighted sum of completion times, ∑_{j∈J} w_j C_j.
It is worth noting that, by convention, jobs are assumed to be non-preemptable by default. That is, when the field β is empty, jobs cannot be interrupted until they are completed. For example, R||∑ w_j C_j denotes the problem of finding a schedule of J on the set of machines M, without preempting any job, where each job j ∈ J takes p_ij units of time to be processed on machine i ∈ M, and the weighted sum of completion times ∑ w_j C_j is minimized. On the other hand, R|pmtn|∑ w_j C_j denotes the same problem just described, except that jobs may be preempted. Moreover, note that the field β may take more than one value. For example, R|pmtn, r_j|∑ w_j C_j is the same as the previous problem, with the additional restriction that each job j must start being processed after time r_j.
Among all scheduling problems, a great many are NP-hard and therefore admit no polynomial time algorithm solving them to optimality unless P = NP. In particular, it is easy to show that one of the fundamental problems in the area, P2||C_max, is NP-complete. In what follows we describe some general techniques that can be used to tackle NP-hard optimization problems, together with some of their applications to scheduling.
1.2 Approximation algorithms
The introduction of the class of NP-complete problems by Cook [11], Karp [24] and, independently, Levin [31], left great open challenges regarding how these problems can be tackled given their apparent intractability. One option that has been studied in depth is that of algorithms which solve the problem to optimality but have no polynomial upper bound on their running time. Such algorithms can be useful on small or medium-sized instances, or on instances with some special structure for which the running times are acceptable in practice. However, there may be other instances on which the algorithm takes exponential time to finish, which limits its practical usefulness. The most common of these approaches are Branch and Bound, Branch and Cut, and Integer Programming techniques.
For NP-hard optimization problems, another alternative is to use algorithms that run in polynomial time but do not solve the problem to optimality. Among these, a particularly interesting class are "approximation algorithms", for which the computed solution is guaranteed to have a cost close to the optimum. More precisely, consider a minimization problem P with cost function c. For some α ≥ 1, we say that a solution S of P is an "α-approximation" if its cost c(S) is within a factor α of the optimal cost OPT, i.e., if

    c(S) ≤ α · OPT.     (1.1)

With this, consider an algorithm A whose output on an instance I is A(I). We say that A is an "α-approximation algorithm" if A(I) is an α-approximation for every instance I. The number α is called the "approximation factor" of A, and if α does not depend on the input of A we say that the algorithm is a constant factor approximation algorithm.
Analogously, if P is now a maximization problem with objective function c, a solution S is an α-approximation, for some α ≤ 1, if

    c(S) ≥ α · OPT.

As before, an algorithm A is an α-approximation algorithm if A(I) is an α-approximation for every instance I of P. In what follows we only study minimization problems, so this last definition will not be used.
One of the first approximation algorithms was given by R.L. Graham [19] in 1966, even before the notion of NP-completeness had been formally introduced. Graham studied the problem of minimizing the makespan on parallel machines, P||C_max, proposing the following greedy algorithm: (1) Order the jobs arbitrarily, (j_1, ..., j_n); (2) For each k = 1, ..., n, process job j_k on the machine where it would finish first. A procedure of this kind is called a list-scheduling algorithm.
Lemma 1.1 (Graham 1966 [19]). List-scheduling is a (2 − 1/m)-approximation algorithm for P||C_max.
Proof. First note that if OPT denotes the makespan of the optimal solution, then

    OPT ≥ (1/m) ∑_{j∈J} p_j,     (1.2)

since the right-hand side of this equation is the average load of the m machines, which must be at most the optimal makespan. Now, let ℓ ∈ {1, ..., n} be such that C_{j_ℓ} = C_max, and denote by S_j = C_j − p_j the time at which job j starts being processed. Then, noting that at time S_{j_ℓ} all machines are busy, we have

    S_{j_ℓ} ≤ (1/m) ∑_{k=1}^{ℓ−1} p_{j_k},

and therefore

    C_max = S_{j_ℓ} + p_{j_ℓ} ≤ (1/m) ∑_{k=1}^{ℓ} p_{j_k} + (1 − 1/m) p_{j_ℓ} ≤ (2 − 1/m) OPT,     (1.3)

where the last inequality follows from (1.2) and from the fact that p_{j_ℓ} ≤ OPT, since no schedule can finish before p_j for any j ∈ J.
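To make the procedure concrete, the following Python sketch implements Graham's list-scheduling rule for P||C_max. The instance format (a list of processing times) and the use of a heap to find the currently least loaded machine are illustrative choices made only for this example.

import heapq

def list_scheduling(processing_times, m):
    """Greedy list scheduling for P||Cmax.

    Jobs are taken in the given (arbitrary) order and each one is placed
    on the machine on which it would finish first, i.e. the currently
    least loaded machine. Returns the makespan and the job -> machine map.
    """
    loads = [(0.0, i) for i in range(m)]   # heap of (current load, machine)
    heapq.heapify(loads)
    assignment = {}
    for j, p in enumerate(processing_times):
        load, i = heapq.heappop(loads)          # least loaded machine
        assignment[j] = i
        heapq.heappush(loads, (load + p, i))    # job j finishes at load + p
    makespan = max(load for load, _ in loads)
    return makespan, assignment

# Example: 5 jobs on 2 machines; by Lemma 1.1 the returned makespan is
# at most (2 - 1/2) times the optimal one.
print(list_scheduling([3, 1, 4, 1, 5], 2))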
As can be observed in the proof, a crucial step in the previous analysis is to obtain a "good" lower bound on the optimal solution (for example, Equation (1.2) in the previous lemma), and then use it to upper bound the solution given by the algorithm (as in Equation (1.3)). Most techniques for finding lower bounds on the optimum are problem dependent, and it is therefore hard to give general rules for finding them. One of the few exceptions, which has proved useful in a variety of contexts, is to formulate the problem as an integer linear program and then relax the integrality conditions. Clearly, the solution of the relaxed problem is a lower bound for the original problem. An algorithm that uses this technique to find lower bounds is commonly called an "LP-based approximation algorithm". To illustrate this idea we consider the following problem.
Minimum Cost Vertex Cover:
Input: A graph G = (V, E) and a cost function c : V → Q on the vertices.
Objective: Find a vertex cover, i.e. a set B ⊆ V such that every edge in E is incident to some vertex in B, minimizing the cost c(B) = ∑_{v∈B} c(v).
It is easy to see that this problem is equivalent to the following integer program:
    [PL]   min ∑_{v∈V} y_v c(v)                                  (1.4)
           y_v + y_w ≥ 1        for all vw ∈ E,                  (1.5)
           y_v ∈ {0, 1}         for all v ∈ V.                   (1.6)
Then, replacing Equation (1.6) by y_v ≥ 0, we obtain a linear program whose optimal value is a lower bound for the Minimum Cost Vertex Cover problem. To obtain a constant factor approximation algorithm, we proceed as follows. First we solve [PL] (using, for example, the ellipsoid method), and call the solution y*_v. To round this fractional solution, note that Equation (1.5) implies that for every edge vw ∈ E, either y*_v ≥ 1/2 or y*_w ≥ 1/2. Therefore, the set B = {v ∈ V | y*_v ≥ 1/2} is a vertex cover, and moreover we can bound its cost as follows:

    c(B) = ∑_{v : y*_v ≥ 1/2} c(v) ≤ 2 ∑_{v∈V} y*_v c(v) ≤ 2 OPT_PL ≤ 2 OPT,     (1.7)

where OPT is the optimal value of the Minimum Cost Vertex Cover problem and OPT_PL is the optimum of [PL]. Hence the algorithm described is a 2-approximation.
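As an illustration, the following Python sketch (assuming SciPy is available) solves the relaxation of [PL] with scipy.optimize.linprog and applies the 1/2-threshold rounding just described; the instance encoding is an assumption made only for this example.

import numpy as np
from scipy.optimize import linprog

def vertex_cover_lp_rounding(n, edges, costs):
    """2-approximation for Minimum Cost Vertex Cover via LP rounding.

    n: number of vertices (labelled 0..n-1)
    edges: list of pairs (v, w); costs: list of vertex costs
    """
    # Constraints y_v + y_w >= 1 written as -y_v - y_w <= -1 for linprog.
    A_ub = np.zeros((len(edges), n))
    for k, (v, w) in enumerate(edges):
        A_ub[k, v] = A_ub[k, w] = -1.0
    b_ub = -np.ones(len(edges))
    res = linprog(c=costs, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * n)
    y = res.x
    # Rounding: keep every vertex with fractional value at least 1/2.
    cover = [v for v in range(n) if y[v] >= 0.5 - 1e-9]
    return cover, sum(costs[v] for v in cover)

# Example: a triangle with unit costs; the LP optimum is 3/2 and the
# rounded cover has cost at most 2 * OPT_PL.
print(vertex_cover_lp_rounding(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 1.0]))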
Noting that OPT ≤ c(B), Equation (1.7) implies that

    OPT / OPT_PL ≤ 2

for every instance I of Minimum Cost Vertex Cover. More generally, any α-approximation algorithm that uses OPT_PL as a lower bound must satisfy

    max_I  OPT / OPT_PL ≤ α.

The left-hand side of this last equation is called the "integrality gap" of the linear program. Finding a lower bound on the integrality gap of a linear program is a standard technique for determining the best factor achievable by an approximation algorithm that uses the linear program as a lower bound. To do so it suffices to exhibit an instance with a ratio OPT / OPT_PL as large as possible. For example, it is easy to see that the algorithm just described for Minimum Cost Vertex Cover is the best possible among those using [PL] as a lower bound. Indeed, if we take G to be the complete graph on n vertices with cost function c ≡ 1, we obtain OPT = n − 1 and OPT_PL = n/2, so that OPT / OPT_PL → 2 as n → ∞.
1.3 Polynomial time approximation schemes
Given an NP-hard optimization problem, it is natural to ask for the polynomial time approximation algorithm with the best possible approximation factor. Clearly, this depends on the problem. On the one hand, there are problems that admit no approximation algorithm unless P = NP. For example, the traveling salesman problem with binary costs cannot be approximated within any factor. Indeed, if there were an α-approximation algorithm for this problem, we could decide whether or not there exists a Hamiltonian circuit of cost zero: if the optimal solution of the traveling salesman problem is zero, the approximation algorithm must return zero by (1.1), regardless of the value of α; if the optimal value is greater than zero, so is the solution given by the algorithm.
On the other hand, some problems admit algorithms with arbitrarily good approximation factors. To formalize this idea we define a "polynomial time approximation scheme" (PTAS) as a family of algorithms {A_ε}_{ε>0} such that each A_ε is a (1 + ε)-approximation algorithm running in polynomial time. It is worth noting that ε is not considered part of the input, and therefore the running time of the algorithm may depend exponentially on ε. Moreover, a PTAS in which the running time of the A_ε depends polynomially on 1/ε is called a "fully polynomial time approximation scheme" (FPTAS).
1.4 Problem definition
In this thesis we study a scheduling problem that arises naturally in industry. Consider a situation where clients place orders, each consisting of several products, with a manufacturer who has a set of machines available to process them. Each product must be processed on one of the m available machines, and the time it takes to process each job may depend on the machine on which it is scheduled. The producer must decide on a schedule with the objective of giving the best possible service to its clients.
In its most general version, the problem we consider is as follows. We are given a set of jobs J and a set of orders O ⊆ P(J), such that ∪_{L∈O} L = J. Each job j ∈ J has an associated value p_ij representing its processing time on machine i ∈ M. Moreover, each order L ∈ O has an associated weight w_L, reflecting how important the order is for the producer. Also, each job j has a release date r_ij, which may depend on the machine on which the job is processed, such that j cannot start being processed on machine i before r_ij. An order is completed when all jobs belonging to it have been completely processed. Thus, if C_j denotes the time at which job j finishes being processed, then C_L = max{C_j : j ∈ L} is the completion time of order L ∈ O. The producer's objective is to find a (non-preemptive) schedule of the jobs on the m machines minimizing the "weighted sum of order completion times", i.e.,

    min ∑_{L∈O} w_L C_L.

It is worth noting that in the setting just described we allow a job to belong to more than one order simultaneously, i.e., the orders in O need not be pairwise disjoint.
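A short Python sketch of how this objective is evaluated for a given schedule may help fix the definitions; the representation of a schedule by a dictionary of job completion times is only an illustrative assumption.

def weighted_order_completion(C, orders, weights):
    """Evaluate the objective sum_L w_L * C_L.

    C: dict mapping each job j to its completion time C_j
    orders: dict mapping each order name L to the set of jobs in L
    weights: dict mapping each order name L to its weight w_L
    """
    total = 0.0
    for L, jobs in orders.items():
        C_L = max(C[j] for j in jobs)   # an order completes when its last job does
        total += weights[L] * C_L
    return total

# Example: two orders sharing job 2 (orders need not be disjoint).
C = {1: 3.0, 2: 5.0, 3: 4.0}
orders = {"A": {1, 2}, "B": {2, 3}}
weights = {"A": 2.0, "B": 1.0}
print(weighted_order_completion(C, orders, weights))  # 2*5 + 1*5 = 15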
To adopt the three-field notation of Graham et al., we denote this problem by R|r_ij|∑ w_L C_L, or R||∑ w_L C_L in the case where all release dates are 0. When the processing times p_ij do not depend on the machine, we replace the "R" by a "P", indicating that we are in the parallel machine case. Moreover, when we impose the extra condition that the orders form a partition of J, we add part to the second field β.
As we will show later, our problem generalizes several classic scheduling problems, including R||C_max, R|r_ij|∑ w_j C_j and 1|prec|∑ w_j C_j. Since all of these problems are strongly NP-hard (see for example [17]), so is our more general problem.
Surprisingly, the best known approximation factor for each of these problems is 2 [4, 35, 37]. However, for our more general setting, no constant factor approximation algorithm was known. The best previous result, due to Leung, Li, Pinedo and Zhang [29], is an algorithm for the special case of related machines (i.e. p_ij = p_j / s_i, where s_i denotes the speed of machine i) in which all release dates are zero. The approximation factor of this algorithm is 1 + ρ(m − 1)/(ρ + m − 1), where ρ is the ratio between the speeds of the fastest and the slowest machine. In general this bound is not constant, and can be as bad as m/2.
1.5 Previous work
To illustrate the flexibility of our model, we present several models that our problem generalizes, and review the most important known results about them.
1.5.1 Single machine
We begin by considering the problem of minimizing the weighted sum of order completion times on a single machine. We first study the case where no job belongs to more than one order, 1|part|∑ w_L C_L, showing that it is equivalent to 1||∑ w_j C_j. Smith [41] showed that this last problem can be solved in polynomial time using a list-scheduling algorithm that orders the jobs by non-increasing ratio w_j / p_j. This greedy algorithm is known as "Smith's rule".
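For concreteness, a minimal Python sketch of Smith's rule follows; the input format is an assumption made only for this example.

def smiths_rule(jobs):
    """Smith's rule for 1||sum w_j C_j.

    jobs: list of (w_j, p_j) pairs. Jobs are processed in order of
    non-increasing w_j / p_j; returns the optimal objective value.
    """
    order = sorted(jobs, key=lambda wp: wp[0] / wp[1], reverse=True)
    t, total = 0.0, 0.0
    for w, p in order:
        t += p             # completion time of the current job
        total += w * t
    return total

# Example: print(smiths_rule([(3, 1), (1, 2), (2, 2)]))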
To see that the two problems mentioned are indeed equivalent, we first show that there exists an optimal solution of 1|part|∑ w_L C_L in which all jobs of an order L ∈ O are processed consecutively. To see this, consider an optimal solution where this does not happen. Then there are jobs j, ℓ ∈ L and k ∈ L′ ≠ L such that k starts being processed at time C_j, and ℓ is processed after k. Swapping jobs j and k, i.e. moving k earlier by p_j units of time and delaying j by p_k units of time, does not increase the cost of the solution. Indeed, job k decreases its completion time, so C_{L′} does not increase. Order L does not increase its completion time either, since job ℓ ∈ L, which is still processed after j, remains unchanged. Iterating this argument we end up with a schedule in which all jobs of an order are processed consecutively. Therefore, each order can be viewed as a single larger job with processing time ∑_{j∈L} p_j, and hence 1|part|∑ w_L C_L reduces to 1||∑ w_j C_j.
We now consider the more general problem 1||∑ w_L C_L, where we allow a job to belong to more than one order simultaneously. It can be shown that this problem is equivalent to 1|prec|∑ w_j C_j (see Chapter 2.5.1), which we summarize in the following theorem.
Theorem 1.2. There exists an α-approximation algorithm for 1|prec|∑ w_j C_j if and only if there exists one for 1||∑ w_L C_L.
The problem with precedence constraints, 1|prec|∑ w_j C_j, has received a lot of attention since the sixties. Lenstra and Rinnooy Kan [26] proved that this problem is strongly NP-hard, even if the weights or the processing times are all equal to one. On the other hand, several 2-approximation algorithms have been proposed: Hall, Schulz, Shmoys and Wein [21] gave an algorithm based on a linear relaxation, while Chudak and Hochbaum [6] proposed another 2-approximation algorithm based on a half-integral linear relaxation. In addition, Chekuri and Motwani [4], and Margot, Queyranne and Wang [32], independently developed a simple combinatorial algorithm with approximation factor 2. Moreover, the results in [2, 12] imply that 1|prec|∑ w_j C_j is a special case of vertex cover. On the other hand, nothing was known about the hardness of approximating this problem until recently, when Ambühl, Mastrolilli and Svensson [3] showed that it admits no PTAS unless NP-hard problems can be solved in randomized subexponential time.
1.5.2 Parallel machines
In this section we discuss scheduling problems on parallel machines, where the processing times of each job j are given by p_ij = p_j, independent of the machine on which j is processed.
Recall the previously defined problem of minimizing the makespan on parallel machines, P||C_max, which consists of finding a schedule of a set of jobs J on a set M of m parallel machines minimizing the maximum completion time. Note that if in our problem P||∑ w_L C_L the set O contains a single order, the objective function becomes max_{j∈J} C_j = C_max, and therefore P||C_max is a special case of P||∑ w_L C_L, which in turn is a special case of our more general model R|r_ij|∑ w_L C_L.
P||C_max is a classic problem in scheduling. It is easy to show that it is NP-hard, even for m = 2, since the 2-partition problem reduces to it. On the other hand, as we showed in Lemma 1.1, a list-scheduling algorithm is a 2-approximation algorithm. Moreover, Hochbaum and Shmoys [22] presented a PTAS for this problem (see also [42, Chapter 10]).
On the other hand, when in our model every order contains a single job, the problem becomes equivalent to minimizing the weighted sum of completion times ∑_{j∈J} w_j C_j. Hence our problem also generalizes P||∑ w_j C_j. The study of this problem also goes back to the sixties (see for example [9]). As with P||C_max, the problem is NP-hard even when there are only two machines on which to process the jobs. A sequence of approximation algorithms was proposed until Skutella and Woeginger [40] found a PTAS for this problem. Later, Afrati et al. [1] extended this result to the case with nontrivial release dates.
This raises the question of whether there is a PTAS for P|part|∑ w_L C_L (recall that, as discussed in Section 1.5.1, the slightly more general problem P||∑ w_L C_L has no PTAS unless NP-hard problems can be solved in randomized subexponential time). Although we do not know whether the former is true, Leung, Li and Pinedo [28] (see also Yang and Posner [44]) presented a 2-approximation algorithm for this problem, which is the best known so far.
1.5.3 Unrelated machines
In the more general case of unrelated machines, our problem also generalizes several classic scheduling problems. As before, if there is a single order and r_ij = 0, our problem becomes equivalent to R||C_max. Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for R||C_max, and showed that no algorithm with a guarantee better than 3/2 can be obtained unless P = NP. Hence, the same holds for our more general problem R||∑ w_L C_L.
On the other hand, if the orders are singletons and the release dates are trivial, i.e. r_ij = 0, our problem becomes R||∑ w_j C_j. As in the makespan case, this problem is APX-hard [23] and therefore admits no PTAS unless P = NP. Nevertheless, Schulz and Skutella [35] used a linear relaxation to design a (3/2 + ε)-approximation algorithm for the case with zero release dates, and a (2 + ε)-approximation when the release dates are nontrivial. Moreover, Skutella [38] refined this result using convex quadratic programming, obtaining a 3/2-approximation algorithm when r_ij = 0, and a 2-approximation with arbitrary release dates.
Finally, it is worth mentioning that our problem also generalizes the assembly line problem, A||∑ w_j C_j, which has received considerable attention recently (see e.g. [7, 8, 30]). An instance of this problem consists of a set M of machines and a set of jobs J with associated weights w_j. Each job j ∈ J consists of m parts, such that the i-th part of j must be processed on the i-th machine, where it takes p_ij units of time to be processed. The objective is to minimize the weighted sum of completion times, ∑ w_j C_j, where the completion time of a job in this context is defined as the moment at which the last of its parts finishes being processed. To see that our problem generalizes the assembly line problem, it suffices to identify each job of the assembly line problem with an order of our problem, and its parts with the jobs belonging to that order. To ensure that the jobs in each order can only be processed on their respective machine, we assign them an infinite (or sufficiently large) processing time on all other machines.
Besides proving that the assembly line problem is NP-hard, Chen and Hall [7] and Leung, Li and Pinedo [30] independently gave a 2-approximation algorithm using a linear program based on the so-called parallel inequalities (see also [33]).
1.6 Contributions of this work
In this thesis we develop approximation algorithms for R|r_ij|∑ w_L C_L and some of its special cases. Below we summarize each of the chapters.
1.6.1 Chapter 3: The power of preemption for R||C_max
In this chapter we study the problem of minimizing the makespan on unrelated machines, R||C_max, which is a special case of our more general problem R||∑ w_L C_L. The techniques developed in this chapter will lead to techniques for obtaining approximation algorithms for the more general cases R|r_ij|∑ w_L C_L and R|r_ij, pmtn|∑ w_L C_L.
We first review the result of Lawler and Labetoulle [25] showing that R|pmtn|C_max can be solved in polynomial time. This result is based on showing that there is a one-to-one correspondence between preemptive schedules and solutions of a linear program. This linear program, which we call [LL] and which is described below, uses assignment variables x_ij denoting the fraction of job j that is processed on machine i ∈ M.
    [LL]   min C
           ∑_{i∈M} x_ij = 1              for all j ∈ J,     (1.8)
           ∑_{j∈J} p_ij x_ij ≤ C         for all i ∈ M,     (1.9)
           ∑_{i∈M} p_ij x_ij ≤ C         for all j ∈ J,     (1.10)
           x_ij ≥ 0                      for all i, j.      (1.11)
It is clear that every preemptive schedule induces a feasible solution of [LL]. Indeed, given a preemptive schedule, let C be its makespan and x_ij the fraction of job j processed on machine i. Then Equation (1.8) must be satisfied since every job is completely processed. Moreover, Equation (1.9) is also satisfied since no machine i ∈ M can finish processing its jobs before ∑_j p_ij x_ij. Similarly, Equation (1.10) holds since no job j can be processed on two machines simultaneously, and therefore the left-hand side of this equation is a lower bound on the completion time of job j. The converse implication, namely that every solution of [LL] induces a preemptive schedule with makespan C, requires more work and uses matching techniques on bipartite graphs.
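The following Python sketch (again assuming SciPy) builds and solves [LL] for a given matrix of processing times; it returns the optimal preemptive makespan C and the fractional assignment x, and illustrates only the relaxation, not the bipartite-matching step that turns x into an actual preemptive schedule.

import numpy as np
from scipy.optimize import linprog

def solve_LL(p):
    """Solve the linear program [LL] for R|pmtn|Cmax.

    p: (m x n) array of processing times p_ij.
    Returns (C, x), where C is the optimal LP value and x[i, j] is the
    fraction of job j assigned to machine i.
    """
    m, n = p.shape
    nvar = m * n + 1                        # variables x_ij plus C (last one)
    c = np.zeros(nvar); c[-1] = 1.0         # minimize C
    # Equalities (1.8): sum_i x_ij = 1 for every job j.
    A_eq = np.zeros((n, nvar)); b_eq = np.ones(n)
    for j in range(n):
        for i in range(m):
            A_eq[j, i * n + j] = 1.0
    # Inequalities (1.9) and (1.10): machine loads and job lengths at most C.
    A_ub = np.zeros((m + n, nvar)); b_ub = np.zeros(m + n)
    for i in range(m):
        for j in range(n):
            A_ub[i, i * n + j] = p[i, j]        # (1.9) for machine i
            A_ub[m + j, i * n + j] = p[i, j]    # (1.10) for job j
    A_ub[:, -1] = -1.0                          # ... - C <= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * nvar)
    x = res.x[:-1].reshape(m, n)
    return res.x[-1], x

# Example: C, x = solve_LL(np.array([[2.0, 1.0, 3.0], [1.0, 2.0, 2.0]]))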
Next we propose a way to round any solution of [LL] to a non-preemptive schedule, increasing the makespan by a factor of at most 4. Combining this with the fact that [LL] gives a lower bound for R||C_max, we obtain a 4-approximation algorithm for R||C_max. Although this does not improve on the 2-approximation given by Lenstra, Shmoys and Tardos [27] for this problem, it has the advantage that the rounding technique used is easy to generalize to our general problem R|r_ij|∑ w_L C_L. Given x and C, a solution of [LL], the rounding proceeds as follows.
1. We begin by setting to zero the variables that process a job on a machine on which it takes a long time. More precisely, we define

       y_ij = 0       if p_ij > 2C,
       y_ij = x_ij    otherwise.

   After this, no job is partially assigned to a machine on which it would take more than 2C units of time to be processed. However, the jobs are no longer completely processed in the fractional solution y. To fix this we rescale the variables so that the new solution, x′, satisfies (1.8). Thanks to Equation (1.10), it is easy to see that in doing so no variable increases by more than a factor of 2.
2. Finally we apply to the solution x′ a well-known result of Shmoys and Tardos [37], summarized in the following theorem.

Theorem 1.3 (Shmoys and Tardos [37]). Given a nonnegative fractional solution of the following system of equations:

    ∑_{j∈J} ∑_{i∈M} c_ij x_ij ≤ C,                         (1.12)
    ∑_{i∈M} x_ij = 1              for all j ∈ J,           (1.13)

there exists an integral solution x̂_ij ∈ {0, 1} satisfying (1.12), (1.13), and moreover,

    x_ij = 0  ⟹  x̂_ij = 0                                                 for all i ∈ M, j ∈ J,   (1.14)
    ∑_{j∈J} p_ij x̂_ij ≤ ∑_{j∈J} p_ij x_ij + max{p_ij : x_ij > 0}          for all i ∈ M.

Moreover, such an integral solution can be found in polynomial time.

It is easy to prove that the algorithm just described ends with a non-preemptive schedule of makespan at most 4C, where C is the makespan of the fractional solution given by x.
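A minimal Python sketch of the filtering and rescaling step (step 1 above) is given below; the final call to the Shmoys-Tardos rounding of Theorem 1.3 is left abstract, since that procedure is not reproduced here.

import numpy as np

def filter_and_rescale(x, p, C):
    """Step 1 of the rounding: drop assignments with p_ij > 2C and rescale.

    x: (m x n) fractional assignment satisfying (1.8)-(1.11)
    p: (m x n) processing times; C: LP makespan.
    Returns x' with sum_i x'_ij = 1 for every job and x'_ij = 0 whenever
    p_ij > 2C. By (1.10) the dropped mass per job is less than 1/2, so no
    variable grows by more than a factor 2.
    """
    y = np.where(p > 2 * C, 0.0, x)
    col_sums = y.sum(axis=0)       # remaining fraction of each job
    return y / col_sums            # rescale each column back to 1

# The resulting x' is then fed to the Shmoys-Tardos rounding (Theorem 1.3)
# to obtain an integral assignment with makespan at most 4C.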
We finish Chapter 3 by showing that the integrality gap of [LL] is exactly 4, which implies that the rounding technique just described is best possible. To do so, we construct a family of instances of R||C_max, {I_β}_{β<4}, such that if C_β^INT denotes the optimal makespan with non-preemptable jobs, and C_β denotes the optimal value of [LL], then C_β^INT / C_β ≥ β for all β < 4.
1.6.2 Chapter 4: Approximation algorithms for minimizing ∑ w_L C_L on unrelated machines
In this chapter we present approximation algorithms for the general case R|r_ij|∑ w_L C_L, as well as for its preemptive version, R|r_ij, pmtn|∑ w_L C_L. Most of the techniques presented in this chapter are generalizations of the methods shown in Chapter 3.
We first give a (4 + ε)-approximation algorithm for R|r_ij, pmtn|∑ w_L C_L. To this end we consider a time-indexed linear program whose variables represent the fraction of each job processed at each (discrete) point in time on each machine. This kind of linear relaxation was originally introduced by Dyer and Wolsey [13] for the problem 1|r_j|∑_j w_j C_j, and was later extended by Schulz and Skutella [35], who used it to obtain (3/2 + ε)- and (2 + ε)-approximation algorithms for R||∑ w_j C_j and R|r_ij|∑ w_j C_j, respectively.
The linear relaxation considers a time horizon T, large enough to be an upper bound on the makespan of any reasonable schedule, for example T = max_{i∈M, k∈J} {r_ik + ∑_{j∈J} p_ij}. We then divide the time horizon into exponentially growing intervals, so that there are only a polynomial number, O(log T), of intervals. To this end, let ε be a fixed parameter, and let q be the first integer such that (1 + ε)^{q−1} ≥ T. We then consider the intervals

    [0, 1], (1, (1 + ε)], ((1 + ε), (1 + ε)^2], ..., ((1 + ε)^{q−2}, (1 + ε)^{q−1}].

To simplify notation, we define τ_0 = 0 and τ_ℓ = (1 + ε)^{ℓ−1} for each ℓ = 1, ..., q. With this, the ℓ-th interval corresponds to (τ_{ℓ−1}, τ_ℓ].
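The following small Python helper shows one way to generate these interval endpoints; it is only a sketch of the construction just described.

def geometric_intervals(T, eps):
    """Return the endpoints tau_0, ..., tau_q of the intervals
    [0,1], (1, 1+eps], ..., ((1+eps)^(q-2), (1+eps)^(q-1)],
    where q is the first integer with (1+eps)^(q-1) >= T."""
    q = 1
    while (1 + eps) ** (q - 1) < T:
        q += 1
    tau = [0.0] + [(1 + eps) ** (l - 1) for l in range(1, q + 1)]
    return tau   # the l-th interval is (tau[l-1], tau[l]]

# Example: geometric_intervals(100, 0.5) returns O(log T / eps) endpoints.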
Given a preemptive schedule, let y_{jiℓ} be the fraction of job j processed on machine i in the ℓ-th interval. Then p_ij y_{jiℓ} is the amount of time that job j uses on machine i in the ℓ-th interval. Consider the following linear program.
    [DW]   min ∑_{L∈O} w_L C_L
           ∑_{i∈M} ∑_{ℓ=1}^{q} y_{jiℓ} = 1                           for all j ∈ J,                          (1.15)
           ∑_{j∈J} p_ij y_{jiℓ} ≤ τ_ℓ − τ_{ℓ−1}                      for all ℓ = 1, ..., q and i ∈ M,        (1.16)
           ∑_{i∈M} p_ij y_{jiℓ} ≤ τ_ℓ − τ_{ℓ−1}                      for all ℓ = 1, ..., q and j ∈ J,        (1.17)
           ∑_{i∈M} ( y_{ji1} + ∑_{ℓ=2}^{q} τ_{ℓ−1} y_{jiℓ} ) ≤ C_L   for all L ∈ O, j ∈ L,                   (1.18)
           y_{jiℓ} = 0                                               for all j, i, ℓ such that r_ij > τ_ℓ,   (1.19)
           y_{jiℓ} ≥ 0                                               for all i, j, ℓ.                        (1.20)
It is easy to verify that this linear program gives a lower bound for our problem R|r_ij, pmtn|∑ w_L C_L. Indeed, Equation (1.15) ensures that every job is completely processed. Equation (1.16) must also hold, since in each interval ℓ and machine i the total amount of time available is at most τ_ℓ − τ_{ℓ−1}. Similarly, Equation (1.17) is satisfied since no job can be processed on two machines simultaneously, and therefore in each interval ℓ the total amount of time that can be used to process a job is at most the length of the interval. To see that Equation (1.18) is valid, note that p_ij ≥ 1, and therefore C_L ≥ 1 for all L ∈ O. Note also that C_L ≥ τ_{ℓ−1} for all L, j ∈ L, i, ℓ such that y_{jiℓ} > 0. Hence, the left-hand side of Equation (1.18) is a convex combination of values that are at most C_L. Finally, Equation (1.19) is valid since no part of a job can be assigned to an interval that ends before its release date, on any machine.
To obtain a (4 + ε)-approximation, we first solve [DW], obtaining a solution y*, {C*_L}_{L∈O}. We then proceed analogously to the rounding of Chapter 3, setting to zero all variables that assign a job j ∈ L to an interval starting after 2C*_L. Next, we rescale the variables to ensure that every job is completely processed. It is easy to see that in doing so each variable increases by at most a factor of 2. Finally, Equations (1.16) and (1.17) allow us to apply the technique of Lawler and Labetoulle [25] on each time interval [τ_{ℓ−1}, τ_ℓ) to ensure that no job is processed on two machines at the same time, obtaining a schedule in which no job j ∈ L is processed after time 4(1 + ε)C*_L. With this, we can prove the following theorem.
Theorem 1.4. For every ε > 0, there is a (4 + ε)-approximation algorithm for R|r_ij, pmtn|∑ w_L C_L.
Next we propose the first constant factor approximation algorithm for the non-preemptive case R|r_ij|∑ w_L C_L. Our algorithm is based on a time-indexed linear program proposed by Hall, Schulz, Shmoys and Wein [21], with variables indicating on which machine and in which interval each job finishes being processed, combined with the rounding technique developed in Chapter 3.
As before, we consider a time horizon T larger than the makespan of any reasonable schedule, for example T = max_{i∈M, k∈J} {r_ik + ∑_{j∈J} p_ij}. We also divide the time horizon into intervals growing exponentially by a factor 3/2,

    [1, 1], (1, 3/2], (3/2, (3/2)^2], ..., ((3/2)^{q−2}, (3/2)^{q−1}].

To simplify notation, we define τ_0 = 1 and τ_ℓ = (3/2)^{ℓ−1} for all ℓ = 1, ..., q. With this, the ℓ-th interval corresponds to (τ_{ℓ−1}, τ_ℓ].
Given a schedule, we define the variables y_{jiℓ} to be one if and only if job j finishes being processed on machine i in the ℓ-th interval. With this in mind we consider the following linear program.
    [HSSW]   min ∑_{L∈O} w_L C_L
             ∑_{i∈M} ∑_{ℓ=1}^{q} y_{jiℓ} = 1             for all j ∈ J,                                (1.21)
             ∑_{s=1}^{ℓ} ∑_{j∈J} p_ij y_{jis} ≤ τ_ℓ       for all i ∈ M and ℓ = 1, ..., q,              (1.22)
             ∑_{i∈M} ∑_{ℓ=1}^{q} τ_{ℓ−1} y_{jiℓ} ≤ C_L    for all L ∈ O, j ∈ L,                         (1.23)
             y_{jiℓ} = 0                                  for all i, ℓ, j such that p_ij + r_ij > τ_ℓ,  (1.24)
             y_{jiℓ} ≥ 0                                  for all i, ℓ, j.                              (1.25)
To see that [HSSW] is a relaxation of our problem, consider an arbitrary non-preemptive schedule and define y_{jiℓ} = 1 if and only if job j finishes being processed on machine i in the ℓ-th interval. Then Equation (1.21) holds since every job finishes in exactly one interval and on exactly one machine. The left-hand side of (1.22) corresponds to the total load processed on machine i in the interval [0, τ_ℓ], so the inequality is satisfied. The double sum in inequality (1.23) equals τ_{ℓ−1}, where ℓ is the interval in which job j is completed, so it is at most C_j, and hence upper bounded by C_L if j ∈ L. Rule (1.24) says that some variables must be fixed to zero before solving the LP. This is valid since if p_ij + r_ij > τ_ℓ then job j cannot finish being processed before τ_ℓ on machine i, and therefore y_{jiℓ} must be zero.
To obtain an approximate solution of our problem, we first solve [HSSW] to optimality, calling the solution y*, {C*_L}_{L∈O}. We then set to zero all variables assigning a job j ∈ L to an interval later than (3/2)C*_L,¹ and rescale so that every job is fully assigned by the fractional solution. It can be shown that in doing so no variable needs to be increased by more than a factor of 3. Next we apply Theorem 1.3, interpreting each interval-machine pair of our variables as one machine of the theorem, thus obtaining an integral assignment of jobs to interval-machine pairs. Note that, thanks to Equations (1.24) and (1.14), the solution worsens by at most a constant factor when the rounding of Theorem 1.3 is applied. We conclude the algorithm by assigning jobs greedily as follows. Within each machine, for every ℓ = 1, ..., q, we process all jobs assigned to the ℓ-th interval as early as possible, ordering them arbitrarily if more than one job is assigned to a given interval. It can be shown that with this algorithm every job j ∈ L finishes being processed before (27/2)C*_L. We obtain the following result.

¹ The number 3/2 is chosen so that the final algorithm gives the best possible approximation factor.

Theorem 1.5. There is a 27/2-approximation algorithm for R|r_ij|∑ w_L C_L.

1.6.3 Chapter 5: A PTAS for minimizing ∑ w_L C_L on parallel machines
In this chapter we design a PTAS for some restricted versions of P|part|∑ w_L C_L. We assume that there is a constant number of machines, a constant number of jobs per order, or a constant number of orders. We first describe the case where the number of jobs per order is bounded by a constant K, and then justify why this implies the existence of PTASes for the other cases. The results in this chapter closely follow the PTAS for P|r_j|∑ w_j C_j developed by Afrati et al. [1].
As is usual in the design of a PTAS, the general idea is to add structure to the solution, modifying the instance in such a way that the cost of the optimal solution worsens by no more than a factor (1 + ε). Moreover, by applying several modifications to the optimal solution of this new instance, we prove that there exists a near-optimal solution satisfying several extra properties. The structure provided by these properties allows us to find such a solution by exhaustive search or dynamic programming. Since each modification applied to the optimal solution only loses a factor (1 + ε) in the cost, we can apply a constant number of them, obtaining a solution within a factor (1 + ε)^{O(1)} of the optimal cost. Then, choosing ε sufficiently small, we can approximate within a factor arbitrarily close to 1.
As in Chapters 3 and 4, we divide the time horizon into exponentially growing intervals. For every integer t, we denote by I_t the interval [(1 + ε)^t, (1 + ε)^{t+1}), and we call |I_t| the size of that interval, i.e. |I_t| = ε(1 + ε)^t.
One of the main techniques we use is "stretching", which consists of stretching the time axis by a factor (1 + ε). Clearly, this worsens the solution only by a factor (1 + ε). The two basic stretching techniques are:
1. Stretching Completion Times: This procedure consists of delaying every job so that the completion time of a job j becomes C′_j = (1 + ε)C_j in the new schedule. It is easy to verify that this procedure creates an idle time of εp_j before each job j.
2. Stretching Intervals: The goal of this procedure is to create idle time in every interval, except for those covered completely by a single job. As before, it consists of shifting jobs to the next interval. More precisely, if job j finishes in I_t and uses d_j units of time in I_t, we move j to I_{t+1}, shifting it by exactly |I_t| units of time, so that it uses d_j units of time in I_{t+1}. Then the completion time in the new solution is at most (1 + ε)C_j, and therefore the total cost of the solution increases by at most a factor (1 + ε).
Note that if j starts being processed in I_t, where it is processed for d_j units of time, after applying the shift it will be processed in I_{t+1} for at most d_j units of time. Since I_{t+1} has ε|I_t| = ε²(1 + ε)^t more units of time than I_t, at least that amount of idle time is created in I_{t+1}. Moreover, we may assume that this idle time is consecutive within each interval. Indeed, this can be achieved by moving all jobs scheduled completely within an interval as far to the left as possible.
Before giving a general description of the algorithm, we present a theorem ensuring the existence of a (1 + ε)-approximate solution in which no order spans more than O(1) intervals. For this, we first show the following basic property, stated in the more general setting of unrelated machines.
Lemma 1.6. For every instance of R|part|∑ w_L C_L there exists an optimal solution such that:
1. For each order L ∈ O and each machine i = 1, ..., m, all jobs of L assigned to machine i are processed consecutively.
2. The sequence in which the orders are processed on each machine is independent of the machine.
Lemma 1.7. Let s := ⌈log(1 + 1/ε)⌉. Then there exists a (1 + ε)-approximate schedule in which every order is completely processed in at most s + 1 consecutive intervals.
In what follows we describe the general idea of the PTAS. We divide the time horizon into blocks of s + 1 = ⌈log(1 + 1/ε)⌉ + 1 intervals, and denote by B_ℓ the block [(1 + ε)^{ℓ(s+1)}, (1 + ε)^{(ℓ+1)(s+1)}). Lemma 1.7 suggests optimizing each block separately, and then combining the solutions of the blocks to build the global solution.
Since there may be orders that cross from one block into the next, it will be necessary to perturb the "shape" of the blocks. To this end we introduce the concept of "frontier". The "outgoing frontier" of a block B_ℓ is a vector with m entries, whose i-th coordinate contains the completion time of the last job processed on machine i among the jobs belonging to orders that start being processed in B_ℓ. In turn, the "incoming frontier" of a block is the outgoing frontier of the previous block. Given a block, an incoming frontier and an outgoing frontier, we say that an order is processed inside block B_ℓ if on every machine all jobs of that order start being processed after the incoming frontier and finish being processed before the outgoing frontier.
Assume for the moment that we know how to compute a (1 + ε)-approximate solution for a given subset of orders V ⊆ O inside a block B_ℓ, with incoming and outgoing frontiers F′ and F respectively. Let W(ℓ, F′, F, V) be the cost (weighted sum of order completion times) of this solution.
Let F_ℓ be the set of possible incoming frontiers of block B_ℓ. Using dynamic programming we can fill a table T(ℓ, F, U) containing the cost of a near-optimal solution for the subset of orders U ⊆ O processed in block B_ℓ or earlier, respecting the outgoing frontier F of B_ℓ. To compute this quantity we can use the following recursive formula:

    T(ℓ + 1, F, U) = min_{F′ ∈ F_ℓ, V ⊆ U} { T(ℓ, F′, V) + W(ℓ + 1, F′, F, U \ V) }.

Unfortunately, the table T does not have a polynomial number of entries, nor even a finite one. It is therefore necessary to reduce its size in the same way as in [1]. With this in mind, the outline of the algorithm is as follows.
Algoritmo: PTAS-DP
1. Localización: En este paso acotamos el perı́odo de tiempo en el cual cada orden puede
ser procesada. Damos estructura extra a la instancia, definiendo un instante de disponibilidad rL para cada orden L, tal que existe una solución (1 + ε)-aproximada donde
cada orden comienza a procesarse después de rL y termina de procesarse antes de un
cierto número constante de intervalos después de rL . Esto juega un rol crucial en el
próximo paso.
xxvii
2. Representación polinomial de subconjuntos de órdenes: El objetivo de este paso es
el de reducir el número de subconjuntos de órdenes que necesitamos considerar en la
programación dinámica. Para ello, para todo ℓ definimos un subconjunto de tamaño
polinomial Θℓ ⊆ 2O de posibles subconjuntos de órdenes que son procesadas en Bℓ o
antes en alguna solución casi-óptima.
3. Representación polinomial de fronteras: En este paso reducimos el número de fronteras
que debemos considerar en la programación dinámica. Para cada ℓ, encontramos Fbℓ ⊂
Fℓ , un conjunto de tamaño polinomial tal que en cada bloque la frontera saliente en
una solución casi-óptima pertenece a Fbℓ .
4. Programación dinámica: Para todo ℓ, F ∈ Fbℓ+1 , U ∈ Θℓ calculamos:
T (ℓ, F, U ) =
min
bℓ ,V ⊆U,V ∈Θℓ−1
F ′ ∈F
{T (ℓ − 1, F ′ , V ) + W (ℓ, F ′ , F, U \ V )}.
Es claro que no es necesario calcular exactamente W (ℓ, F ′ , F, U \ V ); una (1 + ε)aproximación de este valor, que mueve la frontera en a lo más un factor (1 + ε), es
suficiente. Para calcular esto, particionamos las órdenes en pequeñas y grandes. Para
los órdenes grandes usamos enumeración, y esencialmente tratamos cada posible programación de tareas, mientras que para las órdenes pequeñas las procesamos de manera
glotona.
Una de las mayores dificultades de este enfoque es que todas las modificaciones aplicadas
a la solución óptima deben conservar las propiedades dadas por el Lema 1.6. Esto es necesario
para describir la interacción entre un bloque y el siguiente usando solo el concepto de frontera.
En otras palabras, si esto no fuera cierto podrı́a pasar que algún trabajo de una orden que
comienza a procesarse en un bloque Bℓ sea procesado después de un trabajo que pertenece a
una orden que comienza en el bloque Bℓ+1 . Esto incrementarı́a la complejidad del algoritmo,
ya que esta interacción tendrı́a que ser considerada en la programación dinámica. Esta es la
principal razón por la cuál nuestro resultado no se generaliza de manera directa al caso en
donde tenemos tiempos de disponibilidad no triviales, ya que en este caso el Lema 1.6 no se
satisface.
Aplicando cuidadosamente estas ideas, se pueden concluir los siguientes teoremas.
Teorema 1.8. Algoritmo: PTAS-DP es un esquema de aproximación a tiempo polinomial
P
para P |part| wL CL cuando el número de trabajos por orden esta acotado por una constante
xxviii
K.
Teorema 1.9. Algoritmo: PTAS-DP es un esquema de aproximación a tiempo polinomial
P
para P m|part| wL CL .
Teorema 1.10. Algoritmo: PTAS-DP es un esquema de aproximación a tiempo polinoP
mial para P |part| wL CL cuando el número de órdenes es constante.
1.7
Conclusiones
En esta memoria estudiamos problemas de programación de tareas con el objetivo de minimizar la suma ponderada de tiempos de completación de órdenes. En el Capı́tulo 3 comenzamos
estudiando el caso particular de minimizar el makespan en máquinas no relacionadas. Mostramos como una simple técnica de redondeo puede transformar una solución con trabajos
interrumpibles a una en donde ningún trabajo es interrumpido, tal que el makespan aumenta en a lo más un factor 4. Luego, probamos que esta resultado es lo mejor que se puede
alcanzar, por medio de construir una familia de instancias que casi alcanzan esta cota.
P
En el Capı́tulo 4 presentamos algoritmos de aproximación para R|rij | wL CL y su verP
sión con trabajos interrumpibles R|rij , pmpt| wL CL . Ambos algoritmos son basados en una
técnica de redondeo muy similar a la desarrollada en el Capı́tulo 3 para minimizar el makespan. Además, cada algoritmo es el primero en tener un factor de aproximación constante para
cada problema. Sin embargo, todavia quedan varias preguntas abiertas. En primer lugar, podriamos preguntarnos si es que las técnicas de redondeo ocupadas en cada algoritmo pueden
ser mejoradas. A primera vista el paso que parece más factible de mejorar corresponde a
cuando los valores de y son llevados a cero si es que asignan un trabajo a un intervalo muy
tardı́o. Aunque no es una demostración, en el Capı́tulo 3 mostramos que una técnica muy
similar da un redondeo que no puede ser mejorado.
P
Recordemos que el mejor resultado sobre la dificultad de aproximar R|| wL CL deriva
del hecho que es N P-duro aproximar R||Cmax a un factor mejor que 3/2. Considerando que
el algoritmo dado en este escrito asegura una garantı́a de 27/2, serı́a interesante el disminuir
esta diferencia. Dado la generalidad de nuestro modelo, parece ser más fácil hacer esto dando
una reducción diseñada especı́ficamente para nuestro problema, demostrando que nuestro
problema es N P-duro de aproximar a un factor α > 3/2.
P
En el Capı́tulo 5 dimos un PTAS para P |part| wL CL , cuando el número de trabajos por
orden, el número de órdenes o el número de máquinas son constantes. Esto generaliza varios
xxix
P
PTASes previamente conocidos, como por ejemplo los PTAS para P ||Cmax y P || wj Cj . Sin
P
embargo, serı́a interesante el responder la pregunta de si el caso más general P |part| wL CL
es APX-duro o no.
Finalmente, otra posible dirección para continuar esta investigación es el de considerar
nuestro problema en el caso en lı́nea. En esta variante las órdenes llegan a través del tiempo,
y ningún tipo de información es conocida sobre ellas antes de su fecha de disponibilidad.
En problemas en lı́nea estamos interesados en comparar el costo de nuestra solución con la
solución óptima del caso en donde toda la información es conocida desde el tiempo 0. Con
este objetivo, la noción de α-points (ver por ejemplo [18, 5, 35, 10]) ha demostrado ser útil
para el problema de minimizar la suma ponderada de tiempos de completación de trabajos,
y por ende serı́a interesante el estudiar está técnica para nuestro caso más general en la
presencia de órdenes.
xxx
Chapter 2
Introduction
2.1
Machine scheduling problems
Machine scheduling problems deal with the allocation of scarce resources over time. They
arise in several and very different situations, for example, a construction site where the boss
has to assign jobs to each worker, a CPU that must process tasks asked by several users, or
a factory’s production lines that must manufacture products for its clients.
In general, an instance of a scheduling problem contains a set of n jobs J, and a set of m
machines M where the jobs in J must be processed. A solution of the problem is a schedule,
i.e., an assignment that specifies when and on which machines i ∈ M each job j ∈ J is
executed.
To classify scheduling problems we have to look at the different characteristics or attributes that the machines and jobs have, as well as the objective function to be optimized.
One of these is the machine environment, or the characteristics of the machines on our model.
For example, we can consider identical or parallel machines, where each machine is an identical copy of all the others. In this setting each job j ∈ J takes a time pj to be processed,
independent of the machine in which is scheduled. On the other hand, we can consider a
more general situation where each machine i ∈ M has a different speed si , and then the time
that takes to process job j on it is inversely proportional to the speed of the machine.
Additionally, scheduling problems can be classified depending on job’s characteristics.
Just to name a few, our model may consider nonpreemptive jobs, i.e. jobs cannot be interrupted until they are completed, or preemptive jobs, i.e. jobs that can be interrupted at any
time and later resumed on the same or in a different machine.
1
Also, we can classify problems depending on the objective function. One of the more
naturals objective functions is to minimize the makespan, i.e., to minimize the point in time
at which the last job finishes. More precisely, if for some schedule we define the completion
time of a job j ∈ J, denoted as Cj , as the time where job j ∈ J finishes processing, then the
objective is to minimize Cmax := maxj∈J Cj . Other classical example consists on minimizing
the number of late jobs. In this setting, each job j ∈ J has a deadline dj and the objective is
to minimize the number of jobs that finish processing after its deadline. As these, there are
several other different objective functions that can be considered.
A large amount of scheduling problems can be consired by combining the characteristics
just mentioned. So, it becomes necessary to introduce a standard notation for all these
different problems. For this, Grahams, Lawler, Lenstra and Rinnooy Kan [20], introduced
the “three field notation”, where a scheduling problem is represented by an expression of
the form α|β|γ. Here, the first field α denotes the machine environment, the second field β
contains extra constrains or characteristics of the problem, and the last field γ denotes the
objective function. In the following we describe the most common values for α, β and γ.
1. Values of α.
• α = 1 : Single Machine. There is only one machine at our disposal to process the
jobs. Each job j ∈ J takes a given time pj to be processed.
• α = P : Parallel Machines. We have a number m of identical or parallel machines
to process the jobs. Then, the processing time of job j is given by pj , independently
of the machine where job j is processed.
• α = Q: Related Machines. In this setting each machine i ∈ M has a speed si
associated. Then, the processing time of job j ∈ J on machine i ∈ M equals
pj /si , where pj is the time it takes to process j in a machine of speed 1.
• α = R: Unrelated Machines. In this more general setting there is no a priori
relation between the processing times of jobs on each machine, i.e., the processing
time of job j ∈ J on machine i ∈ M is an arbitrary number denoted by pij .
Additionally, in the case that α = P, Q or R, we can add the letter m at the end of the
field indicating that the number of machines m is constant. Then, for example, if under
a parallel machine environment the number of machines is constant, then α = P m. The
value of m can also be specified, e.g., α = P 2 means that there are exactly 2 parallel
machines to process the jobs.
2
2. Values of β.
• β = pmtn: Preemptive Jobs. In this setting we consider jobs that can be preempted, i.e., jobs that can be interrupted and resume later on the same or on a
different machine.
• β = rj : Release Dates. Each job j ∈ J has associated a release date rj , such that
j cannot start processing before that time.
• β = prec: Precedence Constrains. Consider a partial order relation over the jobs
(J, ≺). If for some pair of jobs j y k, j ≺ k, then k must start processing after the
completion time of job j.
3. Values of γ.
• γ = Cmax : Makespan. The objective is to minimize the makespan Cmax :=
maxj∈J Cj .
P
• γ =
Cj : Average Completion Times. We must minimize the average of the
P
completion times, or equivalently j∈J Cj .
P
• γ =
wj Cj : Sum of weight Completion Times. Consider a weight wj for each
j ∈ J. Then, the objective is to minimize the sum of weighted completion time
P
j∈J wj Cj .
It is worth noticing that by default we consider nonpreemptive jobs. In other words,
P
if the field β is empty, then jobs cannot be preempted. For example, R|| wj Cj denotes
the problem of finding a nonpreemptive schedule of a set of jobs J on a set of machines
M , where each job j ∈ J takes pij units of time to process in machine i ∈ M , minimizing
P
P
wj Cj denotes the same problem as before, with
j∈J wj Cj . As a second example, R|rj |
the only difference that a job j can only start processing after rj . Also, note that the field β
P
can take more than just one value. For example, R|prec, rj | wj Cj is the same as the last
problem, but adding precedence constrains.
Over all scheduling problems, most non-trivial ones are N P-hard and therefore there
is no polynomial time algorithm to solve them unless P = N P. In particular, as we will
show later, one of the fundamental problems in scheduling, P 2||Cmax , can be easily proven
N P-hard. In the following section we describe some general techniques to address N P-hard
optimization problems and some basic applications to scheduling.
3
2.2
Approximation algorithms
The introduction of the N P-complete class given by Cook [11], Karp [24] and independently
Levin [31], left big challenges about how these problems could be tackle given their apparent
intractability. One option that has been widely studied is the use of algorithms that completely solves the problem, but has no polynomial upper bound on the running time. This
kind of algorithm can be useful in small to medium instances, or in instances with some
special structure where the algorithm runs fast enough in practice. Nevertheless, there may
be other instances where the algorithm takes exponential time to finish, becoming impractical. The most commons of this approaches are Branch & Bound, Branch & Cut and Integer
Programming techniques.
For the special case of N P-hard optimization problems, another alternative is to use
algorithms that runs in polynomial time, but may not solve the problem to optimality. Among
this kind of algorithms, a particularly interesting class is “approximation algorithms”, i.e.,
algorithms in which the solution is guaranteed to be, in some sense, close to the optimal
solution.
More formally, let us consider a minimization problem P with cost function c. For α ≥ 1,
we say that a solution S to P is an α-approximation if it cost c(S) is within a factor α from
the cost of the optimal OP T , i.e., if
c(S) ≤ α · OP T.
(2.1)
Now, consider a polynomial-time algorithm A whose output over instance I is A(I). Then,
A is an α-approximation algorithm if for any instance I, A(I) is an α-approximation. The
number α is called the approximation factor of algorithm A, and if α does not depends on
the input we say the A is a constant factor approximation algorithm.
Analogously, if P is a maximization problem with objective function c, a solution S is an
α-approximation, for α ≤ 1, if
c(S) ≥ α · OP T.
As before, for α ≤ 1, an algorithm A is an α-approximation algorithm if A(I) is an αapproximation for any instance I. On the remaining of this document we will only study
minimization problems, and therefore we will not use this definition.
One of the firsts approximation algorithm for an N P-hard optimization problem was
presented by R.L. Graham [19] in 1966, even before the notion of N P-completeness was
4
formally introduced. Graham studied the problem of minimizing the makespan on parallel
machines, P ||Cmax . He proposed a greedy algorithm consisting on: (1) Order the jobs arbitrarily, (j1 , . . . , jn ); (2) For k = 1, . . . , n, schedule job jk on the machine where it would
begin processing first. Such a procedure is called a list-scheduling algorithm.
Lemma 2.1 (Graham 1966 [19]). List-scheduling is a (2 − 1/m)-approximation algorithm
for P ||Cmax .
Proof. First notice that if OP T denotes the makespan of the optimal solution, then
OP T ≥
1 X
pj ,
m j∈J
(2.2)
since otherwise the total amount of machine time needed to process all jobs would be less
P
than j∈J pj . Let ℓ be such that Cjℓ = Cmax , and denote Sj = Cj − pj the starting time of
a job j ∈ J. Then, noting that at the ℓ-th step of the algorithm all machines were busy at
time Sjℓ ,
ℓ−1
1 X
pj ,
S jℓ ≤
m k=1 k
and therefore,
ℓ
Cmax
1 X
1
= S jℓ + p jℓ ≤
pjk + (1 − )pjℓ ≤
m k=1
m
1
2−
m
OP T,
(2.3)
where the last inequality follows from (2.2) and the fact that pjℓ ≤ OP T , since no schedule
can finish before pj for any j ∈ J.
As we could observe, a crucial step in the previous analysis is to obtain a good lower
bound on the optimal solution (for example Equation (2.2) in last lemma), to then use it to
upper bound the solution given by the algorithm (as in Equation (2.3)). Most techniques
to find lower bounds are problem specific, and therefore is hard to give general rules of how
to find them. One of the few exceptions that has been proven useful in a widely variety of
problem, consists on formulating the optimization problem as a integer program, and later
relax its integrality constrains. Clearly, the optimal solution of the relaxed problem must be
a lower bound on the optimal solution of the original problem. An algorithm that uses this
technique is called a LP-based approximation algorithm. To illustrate this idea, consider the
following problem.
5
Minimum Cost Vertex-Cover:
Input: A graph G = (V, E), and a cost function c : V → Q over the vertices.
Objective: Find a vertex-cover, i.e., a set B ⊆ V that intersects every edge in E,
P
minimizing the cost c(B) = v∈B c(v).
It is easy to see that this problem is equivalent to the following integer program:
[LP] min
X
yv c(v)
(2.4)
v∈V
yv + yw ≥ 1
for all vw ∈ E,
(2.5)
for all v ∈ V.
(2.6)
yv ∈ {0, 1}
Therefore, by replacing Equation (2.6) by yv ≥ 0, we obtain a linear program whose
optimal value is a lower bound on the optimal of the Minimum Cost Vertex-Cover
problem. To get a constant factor approximation algorithm, we proceed as follows. First
solve [LP] (by, for example, using the ellipsoid method), and call the solution yv∗ . To round
this fractional solution first note that Equation (2.5) implies that for every edge vw ∈ E
either yv∗ ≥ 1/2 or yw∗ ≥ 1/2. Then, the set B = {v ∈ V |yv∗ ≥ 1/2} is a vertex-cover, and
furthermore we can bound its cost as,
c(B) =
X
v:yv∗ ≥1/2
c(v) ≤ 2
X
v∈V
yv∗ c(v) ≤ 2OP TLP ≤ 2OP T,
(2.7)
where OP T denotes the cost of the optimum solution of the vertex-cover problem and OP TLP
is the solution of [LP]. Thus, the algorithm just described is a 2-approximation algorithm.
Noting that OP T ≤ c(B), Equation (2.7) implies that
OP T
≤ 2,
OP TLP
for any instance I of the Minimum Cost Vertex-Cover. More generally, any α- approximation algorithm that uses OP TLP as a lower bound must satisfy
max
I
OP T
≤ α.
OP TLP
The left hand side of this last equation is called the integrality gap of the linear program.
6
Finding a lower bound on the integrality gap is a common technique to see what is the best
approximation factor that a linear program can yield. To do this we just need to find a
instance with a large ratio OP T /OP TLP . For example, is easy to show that the rounding
we just described for Minimum Cost Vertex-Cover is best possible. Indeed, considering
the graph G as the complete graph of n vertices and the cost function c ≡ 1, we get that
OP T = n − 1 and OP TLP = n/2, and thus OP T /OP TLP → 2 when n → ∞.
2.3
Polynomial time approximation schemes
For a given N P-hard problem, it is natural to ask what is the best possible approximation
algorithm in term of its approximation factor. Clearly, this depends on the problem. On
one side, there are some problems that do not admit any kind of approximation algorithms
unless P = N P. For example, the travelling salesman problem with binary costs cannot be
approximated up to any factor. Indeed, if there exists an α-approximation algorithm for this
problem, then we can use it to decide whether exists or not a hamiltonian circuit of cost
zero: If the optimum solution is zero, then the approximation algorithm must return zero by
(2.1), independently of the value of α; If the optimum solution is greater than zero then the
algorithm will also return a solution with cost greater than zero.
On the other hand, there are some problems that admit arbitrarily good approximation
algorithms. To formalize this idea we define a polynomial time approximation scheme (PTAS)
as a collection of algorithms {Aε }ε>0 such that each Aε is a (1 + ε)-approximation algorithm
that runs in polynomial time. Let us remark that ε is not considered as part of the input,
and therefore the running time of the algorithm could depend exponentially on ε.
A common technique to find a PTAS is to “round” the instance such that the solution
space is significantly decreased, but the value of the optimal solution is only slightly changed.
Later, we can use exhaustive search or dynamic programming to find an optimal or nearoptimal (i.e. a (1 + ε)-approximation) solution to the rounded problem. To obtain an
almost-optimal solution to the original problem, we transform the solution of the rounded
instance without increasing the cost in more than a 1 + O(ε) factor.
We briefly show this technique by applying it to P 2||Cmax , i.e. the problem of minimizing
the makespan on two parallel machines. Consider a fixed 0 < ε < 1, and call OP T the
makespan of the optimal solution. We will show how to find a schedule of makespan less
than (1+ε)2 OP T ≤ (1+3ε)OP T , which is enough by redefining ε ← ε/3. Begin by rounding
7
up the values of each pj to powers of (1 + ε),
pj ← (1 + ε)⌈log1+ε pj ⌉ .
With this, the processing time of each job is increased in at most a (1+ε) factor, and so is the
optimal makespan. In other words, by denoting OP Tr the optimal makespan of the rounded
instance, OP Tr ≤ (1 + ε)OP T . Then, it would be enough to find a (1 + ε)-approximation of
the rounded instance, since using that assignment of jobs to machines on the original problem
would only decreases the makespan of the solution, thus yielding a (1 + ε)2 -approximation.
For this, let P = maxj pj , and define a job to be “big” if pj ≥ εP and “small” otherwise.
Thanks to our rounding, the amount of different values the processing time of a big job can
take is less than ⌊log1+ε 1/ε⌋+1 = O(1). Also, notice that a schedule of big jobs is determined
by specifying how many jobs of each size are assigned to each of the two machines. Thus,
we can enumerate all schedules of big jobs in time n⌊log1+ε 1/ε⌋+1 = nO(1) = poly(n), and take
the one with the shortest makespan.
To schedule small jobs, notice that a list-scheduling algorithm is enough: process each job
one step at a time, in any order, on the machine that would finish first. Clearly, this yields
a (1 + ε)-approximation for the rounded instance. Indeed, if after adding the small jobs the
makespan was not increased, then the solution constructed is optimal. On the other hand, if
adding the small jobs increased the makespan, then the difference between the makespan of
both machines is less than εP ≤ εOP Tr . Therefore, the makespan of the solution constructed
is less than (1 + ε)OP Tr ≤ (1 + ε)2 OP T . Thus, we can construct a (1 + ε)2 -approximation
of the original problem in polynomial-time.
Although the algorithm that we just showed runs in polynomial-time for any fixed ε, the
running time increases exponentially when ε decreases. Thus, we may ask if we can do even
better, e.g., if we can find a PTAS for which the running time is also polynomial in ε. Such
a scheme is called a fully polynomial time approximation schemes(FPTAS). Unfortunately,
there are only few problems that admits an FPTAS. Indeed, it can be shown that any strongly
N P-hard problem cannot admit a FPTAS, unless P = N P (see for example [42] Ch. 8).
In the next section we will describe the problem that we are going to work on this thesis.
Not surprisingly the problem is N P-hard, and thus the tools discussed in this and in the
previous sections will be helpful to study it.
8
2.4
Problem definition
In this writing we study a natural scheduling problem arising in manufacturing environments.
Consider a setting where clients place orders, consisting of one or more products, to a given
manufacturer. Each product has a machine dependant processing requirement, and has to
be processed on any of m machines available for production. The manufacturer has to find
a schedule so as to give the best possible service to its clients.
In its most general form, the problem we consider is as follows. We are given a set of jobs
S
J and a set of orders O ⊆ P(J), such that L∈O L = J. Each job j ∈ J is associated with a
value pij which represents its processing time on machine i, while each order L has a weight
factor wL depending on how important it is for the manufacturer. Also, job j is associated
with a machine dependant release date rij , so it can only start being processed on machine
i by time rij . An order is completed once all its jobs have been processed. Therefore, if
Cj denotes the point in time at which job j is completed, CL = max{Cj : j ∈ L} denotes
the completion time of order L. The goal of the manufacturer is to find a nonpreemptive
schedule in the m available machines so as to minimize the sum of weighted completion time
of orders, i.e.,
X
min
wL CL .
L∈O
We refer to this objective function as the sum of weighted completion time of orders. Let us
remark that in this general framework we are not restricted to the case where the orders are
disjoint, and therefore one job may participate in the completion time of several orders.
P
To adopt the three field scheduling notation we denote this problem as R|rij | wL CL , or
P
R|| wL CL , in case all release dates are zero. When the processing times pij do not depend
on the machine, we exchange the “R” by a “P ”. Also, when we impose the additional
constraint that orders are disjoint subsets of jobs we will add part in the second field β of
the notation.
As will be showed later, our problem generalizes several classic machine scheduling probP
P
lems. Most notably, these include R||Cmax , R|rij | wj Cj and 1|prec| wj Cj . Since all of
this are N P-hard in the strong sense (see for example [17]), then our more general setting
also is. It is somewhat surprising that the best known approximation algorithms for all these
problems have an approximation guarantee of 2 [4, 35, 37]. However, for our more general
setting, no constant factor approximation is known. The best known result, due to Leung,
Li, Pinedo and Zhang [29], is an algorithm for the special case of related machines (i.e.,
9
pij = pj /si , where si is the speed of machine i) and without release dates on jobs. The
approximation factor of the algorithm is 1 + ρ(m − 1)/(ρ + m − 1), where ρ is the ratio of
the speed of the fastest machine to that of the slowest machine. In general this guarantee is
not constant and can be as bad a m/2.
2.5
Previous work
To illustrate the flexibility of our model, we now review some relevant scheduling models in
different machine environments that lie in our framework.
2.5.1
Single machine
We begin by considering the problem of minimizing the sum of weighted completion time
of orders on one machine. First we study the simply case where no job belongs to more
P
P
than one order, 1|part| wL CL , showing that is equivalent to 1|| wj Cj . The later, as was
shown by Smith [41], can be solved to optimality by scheduling jobs in non-increasing order
of wj /pj . In the literature, this greedy algorithm is known as Smith’s rule.
To see that the these two problems are indeed equivalent, we first show that there is a
P
optimal schedule of 1|part| wL CL where all jobs of an order L ∈ O are processed consecutively. To see this, consider an optimal schedule where this does not hold. Then, there exist
jobs j, ℓ ∈ L and k ∈ L′ 6= L, such that k starts processing at Cj , and ℓ is processed after
k. Thus, swapping jobs j and k, i.e. delaying j by pk units of time and bringing forward k
by pj units of time, does not increase the cost of the solution. Indeed, job k decreases its
completion time, and so CL′ is not increased. Also, order L does not increase its completion
time since job ℓ ∈ L, which is always processed after j, remains untouched. By iterating this
argument, we finish with a schedule where all jobs in an order are processed consecutively.
P
Therefore, each order can be seen as a larger job with processing time j∈L pj , and thus our
P
problem is equivalent to 1|| wj Cj .
P
We now consider the more general problem 1|| wL CL , where we allow jobs to belong
to several orders at the same time. We will prove that this problem is equivalent to single
P
machine scheduling with precedence constraints denoted by 1|prec| wj Cj . Recall that in
this problem there is a partial order over the jobs meaning that, if j k, then job j
must finish being processed before job k begins processing. If j k we say that j is a
predecessor of k and k is a successor of j. This classic scheduling problem has attracted
10
much attention since the sixties. Lenstra and Rinnooy Kan [26] showed that this problem
is strongly N P-hard even with unit weights or unit processing times. On the other hand,
several 2-approximation algorithms have been proposed: Hall, Schulz, Shmoys & Wein [21]
gave a LP-relaxation based 2-approximation, while Chudak & Hochbaum [6] proposed another 2-approximation based on a half-integral programming relaxation. Also, Chekuri &
Motwani [4], and Margot, Queyranne & Wang [32] independently developed a very simple
P
combinatorial 2-approximation. Furthermore, the results in [2, 12] imply that 1|prec| wj Cj
is a special case of vertex cover. However, hardness of approximation results where unknown
until recently Ambuhl, Mastrolilli & Svensson [3] proved that there is no PTAS for this
problem unless N P-hard problems can be solved in randomized subexponential time.
P
P
We now show that 1|| wL CL and 1|prec| wj Cj are equivalent and therefore all results
P
known for the latter can be also be applied to 1|| wL CL . First, let us see that every αP
P
approximation for 1|prec| wj Cj implies an α-approximation for 1|| wL CL . Let I = (J, O)
P
be an instance of 1|| wL CL , where J is the job set and O the set of orders. We construct
P
an instance I ′ = (J ′ , ) of 1|prec| wj Cj as follows. For each job j ∈ J there is a job j ′ ∈ J ′
with pj ′ = pj and wj ′ = 0. Also, for every order L ∈ O we will consider an extra job j(L) ∈ J ′
with processing time pj(L) = 0 and weight wj(L) = wL . The only precedence constrains that
we will impose will be that j ′ j(L) for all j ∈ L and every L ∈ O. Since pj(L) = 0, we can
restrict ourselves to schedules of I ′ where each j(L) is processed when the last job of L is
completed. Thus, it is clear that the optimal solutions to both problems have the same total
P
cost. Furthermore, it is straightforward to note that given an algorithm for 1|prec| wj Cj
(approximate or not) we can simply apply it to instance I ′ above and impose that j(L) is
processed exactly when the last job of L is completed, without a cost increase. The resulting
P
schedule for I ′ can then be directly applied to the original instance I of 1|| wL CL and its
cost will remain the same.
P
To see the other direction, let I = (J, ) be an instance of 1|prec| wj Cj . To construct
P
an instance I ′ = (J ′ , O) of 1|| wL CL , consider the same set of jobs J ′ = J and for every
job j ∈ J ′ , we let L(j) ∈ O be the order {k ∈ J : k j}, and let wL(j) = wj . With this
construction the following lemma holds.
Lemma 2.2. Any schedule of I ′ can be efficiently transformed into a schedule of the same
instance, respecting the underlying precedence constraints and without increasing the cost.
Proof. Let k be the last job that violates a precedence constrain, and let j be the last job
that is a successor of k but is scheduled before k. We will show that delaying job j right after
11
j
111111111111
000000000000
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
000000000000
111111111111
11111111111
00000000000
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
00000000000
11111111111
k
k
j
Figure 2.1: Top: Original schedule. Bottom: Schedule after delaying j.
job k (see Figure 2.1) does not violate any new precedence constrain, and does not increase
the total cost. Indeed, if moving j after k violates a precedence constrains then there exists
a job j ′ that was originally processed between j and k, such that j j ′ . Thus k j ′ ,
contradicting the choice of j and k.
Also, note that every job but j diminishes its completion time. Furthermore, the completion time of each order containing j is not increased, since each such order also contained
job k and the completion time of j in the new schedule will be the same as the completion
of k in the old schedule.
P
With this lemma we conclude that the optimal schedule for instance I of 1|prec| wj Cj
P
has the same cost as that for instance I ′ of 1|| wL CL . Moreover, any α-approximate
P
schedule for instance I ′ of 1|| wL CL can be transformed into a schedule for instance I of
P
1|prec| wj Cj of the same cost. Thus, the following holds.
P
P
Theorem 2.3. The approximability thresholds of 1|prec| wj Cj and 1|| wL CL coincide.
2.5.2
Parallel machines
In this section we talk about scheduling on parallel machines, where the processing time of
each job j, pij = pj does not depend on the machine where is processed.
Recall the previously defined problem of minimum makespan scheduling on parallel machines, P ||Cmax , which consists in finding a schedule of n jobs in m parallel machines, so as
to minimize the maximum completion time. Notice that if in our setting O only contains
one order, then the objective function becomes maxj∈J Cj = Cmax , and therefore P ||Cmax is
P
a special case of P || wL CL , which at the same time is a special case of our more general
P
model R|rij | wL CL .
12
The problem P ||Cmax has been a classical machine scheduling problem. It can be easily
proven N P-hard, even for 2 machines. Indeed, consider the 2Partition problem where,
for a given multiset of positive integers S = {a1 , . . . , an }, we must decide whether exists a
S
P
P
P
partition R, T ⊆ A, R · T = A such that j∈S aj = j∈R aj = 1/2 j∈A aj . Then, for
a given multiset S, consider n jobs where job j = 1, . . . , n has processing time pj = aj .
Then, finding the minimum makespan schedule on two parallel machines would let us solve
P
2Partition: the minimum makespan equals 1/2 j∈J pj if and only if there exist sets
S
J1 , J2 ⊆ J, J1 · J2 = J, corresponding to the set of jobs processed in each machine, such
P
P
P
that
j∈J1 pj =
j∈J2 pj = 1/2
j∈J pj . And thus, since 2Partition is N P-complete
[24, 17], we conclude that P 2||Cmax is N P-hard. On the other hand, as showed in Lemma
2.1, a list-scheduling approach yields a 2-approximation algorithm. Furthermore, Hochbaum
and Shmoys [22] presented a PTAS for the problem (see also [42, Chapter 10]).
On the other hand, when on our model each order only contains one job, the problem
P
becomes equivalent to minimize the sum of weighted completion times of jobs
j∈J wj Cj .
Thus, in this case, the parallel machine version of our problem with no release dates becomes
P
P || wj Cj . The study of this problem also goes back to the sixties (see [9] for an early
treatment). As in the makespan case, the problem becomes N P-hard already for two machines. On the other hand, a sequence of approximation algorithms had been proposed until
Skutella and Woeginger [40] found a PTAS for the problem. Later, Afrati et al. [1] extended
this result for the case on non-trivial release dates.
P
A natural question is thus to ask if there exists a PTAS for P |part| wL CL exists (notice
P
that, as shown in Section 2.5.1, the slightly more general problem P || wL CL is unlikely to
have a PTAS). Although we do not know whether the latter holds, Leung, Li, and Pinedo
[28] (see also Yang and Posner [44]) presented a 2-approximation algorithm for this problem.
We briefly give an alternative analysis of Leung et al.’s algorithm by using a classic linear
programming framework, first developed by Queyranne [33] for the single machine problem.
Let Mj be the midpoint of job j in a given schedule, in other words, Mj = Cj − pj /2.
Eastman et al. [15] implicitly showed that for any set of jobs S ⊆ J and any feasible
P
2
schedule in m parallel machines, then the inequality:
j∈S pj Mj ≥ p(S) /2m is satisfied,
P
where p(S) = j∈S pj . These inequalities are called the parallel inequalities. It follows that if
OPT denotes the value of an optimal schedule, then OPT is lower bounded by the following
linear program:
X
[LP] min
wL CL
L∈O
13
X
j∈S
CL ≥ Mj + pj /2
for all L ∈ O and j ∈ L,
pj Mj ≥ p(S)2 /2m
for all S ⊆ N.
Queyranne [33] showed that [LP] can be solved in polynomial time since separating the
parallel inequalities reduces to submodular function minimization. Let M1∗ , . . . , Mn∗ be an
optimal solution and assume without loss of generality that M1∗ ≤ M2∗ ≤ · · · ≤ Mn∗ . Clearly,
CL∗ = max{Mj∗ + pj /2 : j ∈ L}, so the optimal solution is completely determined by the
M values. Consider the algorithm that first solves (LP) and then schedules jobs using a
list-scheduling algorithm according to the order M1∗ ≤ M2∗ ≤ · · · ≤ Mn∗ . Let CjA denote the
completion time of job j in the schedule given by the algorithm, so that CLA = max{CjA : j ∈
L}. It is easy to see that CjA equals the time at which job j is started by the algorithm, SjA ,
plus pj . Furthermore, at any point in time before SjA all machines were busy processing jobs
in {1, . . . , j − 1}, thus SjA ≤ p({1, . . . , j − 1})/m. It follows that
CLA
Also, Mj∗ p({1, . . . , j}) ≥
≤ max
P
j∈L
l∈{1,...,j}
CL∗
p({1, . . . , j − 1})
+ pj .
m
pl Ml∗ ≥ p({1, . . . , j})2 /2m. Then,
≥ max
j∈L
p({1, . . . , j}) pj
+
2m
2
.
We conclude that CLA ≤ 2CL∗ which implies that the algorithm returns a solution which
is within a factor of 2 of OPT. Furthermore, note that this approach not only works for
P
P
P |part| wL CL but also for P || wL CL .
2.5.3
Unrelated machines
In the unrelated machine setting, our problem is also a common generalization of some classic
machine scheduling problems. As before, if there is a single order and rij = 0, our problem
becomes minimum makespan scheduling (R||Cmax ), in which the goal is to find a schedule of
the n tasks in m unrelated machines so as to minimize the makespan. In a seminal work,
Lenstra, Shmoys and Tardos [27] give a 2-approximation algorithm for R||Cmax , and showed
that it is N P-hard to approximate it within a constant better than 3/2. Thus, the same
14
hardness result holds for R||
P
wL CL .
On the other hand, if orders are singletons and rij = 0, our problem becomes minimum
P
sum of weighted completion times scheduling (R|| wj Cj ). In this setting each job j ∈ J is
associated, with a processing time pij , and a weight wj . The goal is to find a schedule so as
to minimize sum of weighted completion times of jobs. As in the makespan case, the latter
problem was shown to be APX-hard [23] and therefore there is no PTAS, unless P = N P.
On the positive side, Schulz and Skutella [35] used a linear program relaxation to design
an approximation algorithm with performance guarantee of 3/2 + ε in the case without
release dates, and 2 + ε in the more general case. Furthermore, Skutella [38] refined this
result by means of a convex quadratic programming relaxation obtaining a 3/2-approximation
algorithm in the case of trivial release dates, and a 2-approximation algorithm in the more
general case.
Finally, it is worth mentioning that our problem also generalizes assembly scheduling
P
problems that have received attention recently, which we denote by A|| wj Cj (see e.g.
[7, 8, 30]). As explained before, in this setting we are given a set M with m machines and
a set of jobs J, with associated weights wj . Each job has m parts, one to be processed
by each machine. So, pij denotes the processing time of the i-th part of job j, that must
be processed on machine i. The goal is to minimize the sum of weighted completion times
P
( wj Cj ), where the completion time Cj of job j is the time by which all of its parts have
been processed. Thus, in our setting, a job with its m parts can be modelled as an order
that contains m jobs. To ensure that each of the jobs on each order can only be processed
on its correspondent machine, we give it infinity (or sufficiently large) processing time on all
the others machines.
Besides proving that the assembly line problem is N P-hard, Chen and Hall [7] and
Leung, Li, and Pinedo [30] independently gave a simple 2-approximation algorithm based in
the following linear programming relaxation of the problem:
[LP] min
X
wj Cj
j∈N
X
j∈S
pij Cj ≥
pi (S)2 + p2i (S) /2 for all i = 1, . . . , m, S ⊆ N.
Similarly to the 2-approximation described for P ||
15
P
wL CL in Section 2.5.2, the algorithm
consists in processing jobs according to the order given by an optimal LP solution. Clearly,
this is a 2-approximation. Indeed, consider C1 ≤ · · · ≤ Cn the optimal LP solution (after
reordering if needed) and let S = {1, . . . , k}. Call C H and C ∗ the heuristic and the optimal
P
2
completion time vectors respectively. Clearly, pi (S)Ck ≥
j∈S pij Cj ≥ pi (S) /2, hence
P
2Ck ≥ pi (S) for all i ∈ M . It follows that CkH = max1≤i≤m pi (S) ≤ 2Ck , and then wj CjH ≤
P
P
2 wj Cj ≤ 2 wj Cj∗ , and thus the solution constructed is an 2-approximation.
2.6
Contributions of this work
P
In this thesis we develop approximation algorithms for R|rij | wL CL and some of its particular cases. In Chapter 3 we begin by showing some techniques used in the subsequents
sections. First, we review the result of Lawler and Labetoulle [25] showing that R|pmpt|Cmax ,
i.e. the problem of minimizing the makespan of preemptive jobs on unrelated machines, is
polynomially solvable. Later, we propose a way of rounding any solution of R|pmpt|Cmax
to a solution of R||Cmax , such that the cost of the solution is not increased in more than a
factor of 4. For this we use the classic rounding technique of Shmoys and Tardos [37] for the
generalized assignment problem. We conclude this chapter by showing that this rounding is
best possible. To this end we construct a sequence of instances for which the ratio between its
optimal preemptive makespan and its optimal nonpreemptive makespan is arbitrarily closed
to 4.
In Chapter 4 we generalize the techniques previously developed. We begin by giving a
P
(4 + ε)-approximation for R|pmpt, rij | wL CL , i.e. for each fixed ε > 0 we show a (4 + ε)approximation algorithm. The algorithm is based on a time-index linear program relaxation
of the problem based on that of Dyer and Wolsey [13]. The rounding uses Lawler and Labetoulle’s [25] result, described in the previous chapter. Also we show a 27/2-approximation
P
algorithm for R|rij | wL CL . This is the first constant factor approximation algorithm
for this problem, and thus improves the non-constant factor approximation algorithm for
P
Q|part| wL CL proposed by Leung et al. [29]. Our approach is based on an intervalindexed linear program proposed by Hall et al [21], and uses a very similar rounding to the
one showed in Chapter 3.
P
In Chapter 5 we design a PTAS for P || wL CL , for the cases when the number of orders
is constant, the number of jobs inside each order is constant, or the number of machines
is constant. Our algorithm works in all three cases and thus generalizes the known PTASs
16
P
in [1, 22, 40]. Our approach follows closely the PTAS of Afrati et al. [1] for P |rj | wj Cj .
However, the main extra difficulty from that of Afrati et al. case, is that we might have
orders that are processed through a long period of time, and its cost is only realized when it
is completed. To overcome this issue, and thus be able to apply the dynamic programming
ideas in [1], we simplify the instance and prove that there is a near-optimal solution in
which every order is fully processed in a restricted time span. This requires some careful
enumeration plus the introduction of artificial release dates.
Finally, in Chapter 6 we summarize all the results, and then propose some possible directions for future investigation.
17
Chapter 3
On the power of preemption on
R||Cmax
In this chapter we study the problem of minimizing the makespan on unrelated machines,
R||Cmax , that as was explained before, is a special case of our more general problem of minP
imizing the sum of weighted completion time of orders on unrelated machines, R|| wL CL .
The techniques in this chapter will give insight on how to give approximations algorithms for
P
P
the more general problems R|rij | wL CL and R|rij , pmpt| wL CL .
In Section 3.1, we begin by reviewing the technique developed by Lawler and Labetoulle
[25] to solve R|pmtn|Cmax , that shows that this problem is equivalent to solving a linear
program. In Section 3.2, we give a quick overview of Lenstra, Shmoys and Tardos’s [27]
2-approximation algorithm for R||Cmax , and discuss why it is difficult to apply those ideas to
our more general setting. Then, we show how we can modify this result, getting one easier
to generalize. By doing this we obtain a rounding that turns any preemptive schedule to a
nonpremptive one, such that the makespan is not increased in more than a factor of 4. On
the other hand, in Section 3.3, we prove that this factor is best possible, i.e. there is no
rounding that converts a preemptive schedule to a nonpreemtive one with a guarantee better
than 4. We achieve this by iteratively constructing a family of almost tight instances.
3.1
R|pmtn|Cmax is polynomially solvable
We now present the algorithm developed by Lawler and Labetoulle, that computes the optimal solution of R|pmtn|Cmax . It is based on a linear programming formulation that uses
18
assignment variables xij , indicating the fraction of job j ∈ J that is processed on machine
i ∈ M . With this, it will be enough to give a way of converting any feasible solution of this
linear program to a preemptive schedule of equal makespan, i.e., we need to find a way of
distributing the fractions of each job inside each machine, such that no two fraction of the
same job are processed in parallel.
More precisely, let us consider the following linear program,
[LL]
min C
X
xij = 1
for all j ∈ J,
(3.1)
pij xij ≤ C
for all i ∈ M,
(3.2)
pij xij ≤ C
for all j ∈ J,
(3.3)
xij ≥ 0
for all i, j.
(3.4)
i∈M
X
j∈J
X
i∈M
It is clear that each preemptive schedule induces a feasible solution to [LL]. Indeed, given
any preemptive solution, denote C its makespan and xij the fraction of job j that is processed
on machine i. In other words, if yij denotes the amount of time that the schedule uses to
process job j on machine i, then xij = yij /pij . With this definition, the solution must satisfy
Equation (3.1) since every job is always completely scheduled. Furthermore, Equation (3.2)
P
is also satisfied since no machine i ∈ M can finish processing before j pij xij . Similarly,
Equation (3.3) holds since no job j can be processed in two machines at the same time, and
thus the left hand side of this equation is a lower bound on the completion time of job j.
Let xij and C be any feasible solution of [LL]. Consider the following algorithm that
creates a preemptive schedule of makespan C.
Algorithm: Nonparallel Assignment
1. Define the values zij := pij xij /C, for all i ∈ M and j ∈ J. Note that the vector
(zij )ij belongs to the matching polyhedron P , of all yij ∈ Rnm satisfying the following
inequalities:
19
X
i∈M
X
j∈J
yij ≤ 1
for all j ∈ J,
(3.5)
yij ≤ 1
for all i ∈ M,
(3.6)
yij ≥ 0
for all i, j.
(3.7)
Also, note that P is integral, since the matrix that defines it is totally unimodular (see
for example [34] Ch. 18).
2. Note that by Caratheodory’s theorem [14, 16] it is possible to decompose vector z as
a convex combination of a polynomial number of vertices of P . More precisely, we can
T
find vectors Z k ∈ {0, 1}nm P and scalars λk ≥ 0 for k = 1, . . . mn + 1, such that
P
Pmn+1
k
zij = nm+1
k=1 λk Zij and
k=1 λk = 1.
3. Build the schedule as follows. For each i ∈ M, k = 1, . . . , nm + 1 such that Zijk = 1,
Pk−1
P
schedule job j in machine i, between time C ℓ=1
λℓ and C kℓ=1 λℓ .
We first show the correctness of the algorithm, and later show that it can be execute in
polynomial time.
Lemma 3.1. Let us consider xij and C satisfying equations (3.2), (3.3) and (3.4). Algorithm: Nonparallel Assignment constructs a preemptive schedule of makespan at most
C, where the fraction of job j ∈ J processed on machine i ∈ M is xij .
Proof. First, note that for each i ∈ M and j ∈ J Algorithm: Nonparallel Assignment
process job j during pij xij units of time in machine i. Indeed, for each k = 1, . . . , nm + 1,
i ∈ M and j ∈ J such that Zijk = 1, the amount of time job j is processed on machine i
equals Cλk . Then, since Z k is binary, the total amount of time job j is processed in machine
i equals
nm+1
X
Cλk Zijk = Czij = pij xij .
k=1
Then, the fraction of job j that is processed in machine i is xij .
Furthermore, no job is processed in two machines at the same time. Indeed, if by contradiction we assumed that there is a job that is processed in parallel, then there exist
20
k
k ∈ 1, . . . , mn + 1, j ∈ J and i, d ∈ M such that Zijk = Zdj
= 1. This implies that
Pm k
k
i=1 Zij ≥ 2, contradicting that Z belongs to P .
Finally, the makespan of the schedule is at most C, since the algorithm only assigns jobs
P
between time 0 and C mn+1
k=1 λk = C.
With this the following holds.
Corollary 3.2. To each feasible solution xij , C of [LL] corresponds a preemptive schedule
of makespan C and vice-versa.
Thus, to solve R|pmtn|Cmax it is enough to compute the optimal solution of [LL], and
then turn it to a preemptive schedule using Algorithm: Nonparallel Assignment.
Finally, we show that this algorithm runs in polynomial time.
Lemma 3.3. Algorithm: Nonparallel Assignment runs in polynomial time.
Proof. We just need to show that step (2) can be done in polynomial time. For this, consider
any polytope P = {x ∈ RN |Ax ≤ b} for some matrix A ∈ M(R)K×N and vector b ∈ RK . For
any z ∈ P , we need to show how to decompose z as a convex combinations of vertices of P .
Clearly, it is enough to decompose z = λZ + (1 − λ)z ′ , where λ ∈ [0, 1], Z is a vertex of P ,
and z ′ belong to some proper face P ′ of P . Indeed, if this can be done, we can then interate
the argument over z ′ ∈ P ′ . This procedure will finish after N steps since the dimension of
the polytope is decreased after each iteration.
For this, consider z ∈ P . Find any vertex Z ∈ P , which can be done for example, by
minimizing a linear function over the polytope P . We define z ′ by projecting z into the
frontier of P . For this, let γ̂ = max {γ ≥ 1|Z + γ(z − Z) ∈ P }. In other words, if Ai denotes
the i-th row of A, then
γ̂ = min
i=1,...,K
bi − Ai · Z Ai · (z − Z) 6= 0 .
Ai · (z − Z) With this, define z ′ := Z + γ̂(z −Z) ∈ P , implying that z = z ′ /γ̂ +Z(γ̂ −1)/γ̂. Thus, defining
λ := 1/γ̂ ≤ 1 we get that z = λz ′ + (1 − λ)Z. Finally, note that z ′ belongs to a proper face
of P . For this, it is enough to show that there is i∗ ∈ {1, . . . , K} such that Ai∗ · z ′ = bi∗ and
Ai∗ · Z < bi∗ , which is clear from the choice of γ̂. Then, the face P ′ ∋ z ′ equals,
P ′ := x ∈ RN A′ x ≤ b′ ,
21
where
A′ :=
A
−Ai∗
!
and b′ :=
b
−bi∗
!
.
Note that the complexity of this algorithm is O((V + KN ) · N ), where V denotes the
complexity of finding a vertex of P . In general, V can be done using the ellipsoid method,
but in our particular problem it can be done much faster. Indeed, finding a vertex of a face
in a matching polyhedron of a bipartite graph can be formulated as finding a matching over
a bipartite graph, with the extra restriction that a given subset of vertices must be covered.
Clearly this can be done by finding a maximum weight matching, which can be solved in
O(n2 · m) ([34], Ch. 17.2), where n is the number of jobs and m the number of machines.
Finally, since N = nm, the time complexity of the algorithm is O(n3 · m2 ).
3.2
A new rounding technique for R||Cmax
In 1990, Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for the problem
of minimizing the makespan on unrelated machines. For this, they noticed that if the value
of the optimal makespan Cmax was known, they could formulate the problem as finding an
integer feasible solution of a polytope. This polytope, that uses assignment variables of jobs
to machines xij , is defined by the following set of linear inequalities.
[LST]
X
xij = 1
for all j ∈ J,
pij xij ≤ C
for all i ∈ M,
xij = 0
if pij > C,
xij ≥ 0
for all i, j.
i∈M
X
j∈J
(3.8)
Then, if we can find a feasible integral solution of this polytope in polynomial time, then
we could solve R||Cmax by doing binary search on C to estimate Cmax .
To obtain a 2-approximation algorithm, Lenstra et. al relaxed the integrality contrains of
this feasibility problem, and proposed a rounding technique that turns any vertex of [LST]
to a feasible schedule with makespan at most 2C. Later, Shmoys and Tardos [37] refined
this rounding so they could turn any feasible solution of [LST] (not just a vertex) into a
22
schedule, without increasing the makespan in more than a factor of 2. Shmoys and Tardos
used this new technique to design an approximation algorithms for the generalized assignment
problem.
The main technical difficulty to generalize Lenstra et al.’s rounding technique to our more
P
general problem R|| wL CL , relays on the fact that the value of the optimal makespan must
be previously known or guessed by a binary search procedure, thing that is not clear how
P
to do in R|| wL CL . To overcome this, we further relax [LST] by replacing Equation (3.8)
with Equation (3.3), and thus removing the nonlinearity on the value of the makespan. With
this, we have removed the necessity to estimate Cmax by a binary search procedure, since we
can just minimize the makespan C over a polytope. In other words, we can use the solution
of the linear program [LL] as a lower bound of our problem. In what follows we show how to
round any fractional solution of [LL] to an integral one, such that the makespan increases in
at most a factor of 4. By Corollary 3.2, this is equivalent to turning any preemptive schedule
to a nonpreemptive one, such that the makespan is increase in no more than a factor of 4.
Let x and C be a feasible solution of [LL]. The rounding proceeds in two steps: First, we
eliminate fractional variables whose corresponding processing time is too large; Then, we use
the rounding technique of Shmoys and Tardos [37] as a subroutine. This result is subsumed
in the next theorem.
Theorem 3.4 (Shmoys and Tardos [37]). Given a nonnegative fractional solution to the
following system of equations:
XX
j∈J i∈M
cij xij ≤ C,
X
(3.9)
for all j ∈ J,
xij = 1,
i∈M
(3.10)
there exists an integral solution x̂ij ∈ {0, 1} satisfying (3.9),(3.10), and also,
xij = 0 =⇒ x̂ij = 0
X
X
pij x̂ij ≤
pij xij + max{pij : xij > 0}
j∈J
j∈J
for all i ∈ M, j ∈ J,
for all i ∈ M.
Furthermore, such integral solution can be found in polynomial time.
23
(3.11)
To begin our rounding, we first define a modified solution x′ij as follows:
x′ij
=

0
 xij
Xj
where Xj =
P
i:pij ≤2C
if pij > 2C ∗
(3.12)
else,
xij for all j ∈ J. Note that,
1 − Xj =
X
i:pij >2C
xij ≤
X
i:pij >2C
xij
pij
1
< ,
2C
2
where the last inequality comes from Equation (3.3). Thus Xj > 1/2, which implies that x′ij
satisfies
X
j∈J
x′ij ≤ 2xij
for all j ∈ J, i ∈ M,
x′ij ≤ 2C
for all i ∈ M.
Also, note that by construction the following is also satisfied.
X
x′ij = 1
for all j ∈ J,
x′ij = 0
if pij > 2C.
i∈M
Then, we can apply Theorem 3.4 to x′ij (for cij = 0), to obtain a feasible integral solution
x̂ij to [LL], such that for all i ∈ M ,
X
j∈J
x̂ij pij ≤
X
j∈J
x′ij pij + max{pij : xij > 0} ≤ 2C + 2C = 4C.
Thus, the rounded solution is within a factor 4 of the fractional solution.
3.3
Power of preemption of R||Cmax
We now show that the integrality gap of [LL] is at least 4. This, together with the rounding
developed in the previous section, implies that the integrality gap of [LL] is exactly 4. As
discussed in Section 2.2, this means that it is not possible to construct a rounding with a
24
factor better than 4, thus implying that the naive rounding developed on the previous section
is best possible.
Let us fix β ∈ [2, 4) and ε > 0 such that 1/ε ∈ N. We now construct an instance
I = I(β, ε) such that its optimal preemptive makespan is at most C(1 + ε), and such that any
nonpreemptive solution of I has makespan at least βC. The construction is done iteratively,
maintaining at each iteration a preemptive schedule of makespan (1 + ε)C, while the
makespan of any nonpreemptive solution increases. During the construction of the
instance, we will interchangeably use the equivalence between feasible solutions of [LL] and
preemptive schedules given by Corollary 3.2.
3.3.1 Base case
We begin by constructing an instance I0 , which will later be our first iteration. To this end
consider a set of 1/ε jobs J0 = {j(0; 1), j(0; 2), . . . , j(0; 1/ε)} and a set of 1/ε + 1 machines
M0 = {i(1), i(0; 1), . . . , i(0; 1/ε)}. Every job j(0; ℓ) can only be processed in machine i(0; ℓ),
where it takes βC units of time to process, and in machine i(1), where it takes a very short
time. More precisely, for all ℓ = 1, . . . , 1/ε we define
$$p_{i(0;\ell)\,j(0;\ell)} := \beta C, \qquad p_{i(1)\,j(0;\ell)} := \varepsilon C\,\frac{\beta}{\beta-1}.$$
The rest of the processing times are defined as infinite. Note that a feasible fractional
assignment is given by setting x_{i(0;ℓ)j(0;ℓ)} = 1/β and x_{i(1)j(0;ℓ)} = f_0 := (β − 1)/β, and setting
to zero all other variables. The makespan of this fractional solution is exactly (1 + ε)C.
Indeed, the load of each machine i ∈ M_0, ∑_{j∈J_0} x_{ij}p_{ij}, equals C. Also, the load associated to
each job j ∈ J_0, ∑_{i∈M_0} x_{ij}p_{ij}, equals C + εC. Furthermore, no nonpreemptive solution with
makespan less than βC can have a job j(0; ℓ) processed on machine i(0; ℓ), and therefore all
jobs must be processed on i(1). This yields a makespan of C/f_0 = βC/(β − 1). Therefore,
the makespan of any nonpreemptive solution is at least min{βC, C/f_0}. Note that if β is chosen as
2, the makespan of any nonpreemptive solution must be at least 2C, and therefore the gap of
the instance tends to 2 as ε tends to zero.
Figure 3.1: Instance I0 and its fractional assignment. The values over the arrows xij and pij denote the fractional assignment and the processing time respectively.
3.3.2 Iterative procedure
To increase the integrality gap we proceed iteratively as follows. Starting from instance I_0,
which will be the base case, we show how to construct instance I_1. As we will show later, an
analogous procedure can be used to construct instance I_{n+1} from instance I_n.

Begin by making 1/ε copies of instance I_0, denoted I_0^ℓ for ℓ = 1, . . . , 1/ε, and denote the set
of jobs and machines of I_0^ℓ as J_0^ℓ and M_0^ℓ respectively. Also, denote as i(1; ℓ) the copy of
machine i(1) belonging to M_0^ℓ (see Figure 3.2). Consider a new job j(1) for which p_{i(1;ℓ)j(1)} =
C(β − β/(β − 1)) for all ℓ = 1, . . . , 1/ε (and ∞ otherwise), and define x_{i(1;ℓ)j(1)} = εC/p_{i(1;ℓ)j(1)}.
This way, the load of each machine i(1; ℓ) in the fractional solution is (1 + ε)C, and the load
corresponding to job j(1) is exactly C. Nevertheless, depending on the value of β, job j(1)
may not be completely assigned. A simple calculation shows that for β = (3 + √5)/2, job j(1)
is completely assigned in the fractional assignment.
Figure 3.2: Instance T1 and its fractional assignment. The values over the arrows xij and pij denote the fractional assignment and the processing time respectively.
Furthermore, as justified before, in any
nonpreemptive schedule of makespan less than βC, all jobs of instance I_0^ℓ must be processed
on machine i(1; ℓ). Since job j(1) must also be processed on some machine i(1; ℓ), the
load of that machine must be
$$\sum_{j \in J_0^\ell} p_{i(1;\ell)j} + p_{i(1;\ell)j(1)} = \frac{C\beta}{\beta-1} + C\Big(\beta - \frac{\beta}{\beta-1}\Big) = \beta C.$$
Then, the gap of the instance constructed so far converges to β = (3 + √5)/2 as ε tends
to 0, thus improving the gap of 2 shown before.
On the other hand, for β > (3 + √5)/2 (as we would like) there will be some fraction of
job j(1),
$$f_1 := 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(1;\ell)j(1)} = \frac{(\beta-1)^2 - \beta}{\beta(\beta-1) - \beta},$$
that must be processed elsewhere. To overcome this, we do as follows. Let us denote the
instance consisting of jobs $\bigcup_{\ell=1}^{1/\varepsilon} J_0^\ell$ and machines $\bigcup_{\ell=1}^{1/\varepsilon} M_0^\ell$ as T_1, and construct 1/ε copies of
instance T_1, denoted T_1^k for k = 1, . . . , 1/ε. Also, consider 1/ε copies of job j(1), and denote them by
j(1; k) for k = 1, . . . , 1/ε (see Figure 3.3). As shown before, we can assign a fraction 1 − f_1
of each job j(1; k) to machines of T_1^k. To assign the remaining fraction f_1, we add an extra
machine i(2), with p_{i(2)j(1;ℓ)} := εC/f_1 (and ∞ for all other jobs), so that the fraction f_1 of
each job j(1; ℓ) takes exactly εC to process on i(2). Then, defining x_{i(2)j(1;ℓ)} = f_1, the total
load of each job j(1; ℓ) does not exceed (1 + ε)C, while the load of machine i(2) is exactly C.
Let us denote the instance we have constructed so far as I1 .
Notice that I1 is analogous to I0 in the sense that both satisfy the following properties
for n = 0, 1,
(i) In any nonpreemptive solution of makespan less than βC, every job j(n; ℓ) must be
processed on machine i(n + 1). Therefore the makespan of any nonpreemptive solution
is at least min{βC, C/fn }.
(ii) The makespan of the fractional solution constructed is (1 + ε)C. In particular the load
of machine i(n + 1) is C, and therefore a fraction of a job which takes less than εC can
still be processed on this machine without increasing the makespan.
Furthermore, it is easy to show that C/f0 < C/f1 for β > 2, i.e. the makespan of any
nonpreemptive solution increased from I0 to I1 , and thus the integrality gap of the instance
also increased.
In the following we generalize the ideas shown before, and describe the construction of
an instance with integrality gap arbitrarily close to β, for any β ∈ [2, 4).
Procedure I
1. Construct I0 , f0 , and i0 as in Section 3.3.1, and let n = 0.
2. While f_n > 1/(β − 1), we construct instance I_{n+1} as follows.
(a) Construct an instance T_{n+1} consisting of 1/ε copies of instance I_n, which we denote
I_n^ℓ for ℓ = 1, . . . , 1/ε, where the copy of machine i(n + 1) belonging to I_n^ℓ is denoted
by i(n + 1; ℓ).
(b) Create 1/ε copies of T_{n+1}, denoted T_{n+1}^k for k = 1, . . . , 1/ε. Denote the ℓ-th copy of instance
I_n belonging to instance T_{n+1}^k as I_n^{ℓ,k}, and the copy of machine i(n + 1) that belongs
to instance I_n^{ℓ,k} as i(n + 1; ℓ, k).
(c) Create 1/ε new jobs, j(n + 1; k), for k = 1, . . . , 1/ε, and let p_{i(n+1;ℓ,k)j(n+1;k)} =
C(β − 1/f_n) for all k, ℓ = 1, . . . , 1/ε (and ∞ for all other machines).
We define the assignment variables for these new jobs as
$$x_{i(n+1;\ell,k)\,j(n+1;k)} := \frac{\varepsilon}{\beta - 1/f_n} \qquad \text{for all } k, \ell = 1, \ldots, 1/\varepsilon.$$
This way, the unassigned fraction of each job j(n + 1; k) equals
$$f_{n+1} := 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(n+1;\ell,k)\,j(n+1;k)} \qquad (3.13)$$
$$\;\;= \frac{(\beta - 1)f_n - 1}{\beta f_n - 1}. \qquad (3.14)$$
(d) To assign the remaining fraction of jobs j(n + 1; k) for k = 1, . . . , 1/ε, we create
a new machine i(n + 2), and define pi(n+2)j(n+1;k) = εC/fn+1 for all k = 1, . . . , 1/ε
(and ∞ for all other jobs).
With this we can let x_{i(n+2)j(n+1;k)} = f_{n+1}, so that the load of each job
j(n + 1; k) and of machine i(n + 2) are (1 + ε)C and C, respectively.
(e) Call In+1 the instance constructed so far, and redefine n ← n + 1. Observe that
the defined assignment guarantees that the optimal preemptive makespan for In+1
is at most (1 + ε)C.
3. If fn ≤ 1/(β − 1), that is, the first time the condition of step (2) is not satisfied, we do
half an iteration as follows.
(a) Make 1/ε copies of In , Inℓ for ℓ = 1, . . . , 1/ε, and call i(n+1; ℓ), the copy of machine
i(n + 1) belonging to Inℓ .
(b) Create a new job j(n+1), and define pi(n+1;ℓ)j(n+1) := C(β−1/fn ) and xi(n+1;ℓ)j(n+1) :=
ε. Notice that this way job j(n + 1) is completely processed in the preemptive
solution, and the makespan of the preemptive solution is still (1 + ε)C, since the
load of job j(n + 1) equals C(β − 1/fn ) ≤ C.
(c) Return In+1 , the instance thus constructed.
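As a quick numerical illustration of the driving quantity of Procedure I, the following sketch (an illustrative aid only, not part of the construction) iterates the recurrence (3.14) starting from f_0 = (β − 1)/β until the stopping condition f_n ≤ 1/(β − 1) of Step (2) is reached.

```python
def procedure_I_fractions(beta):
    """Iterate f_0 = (beta-1)/beta and f_{n+1} = ((beta-1)*f_n - 1)/(beta*f_n - 1)
    while f_n > 1/(beta-1); Lemma 3.7 below guarantees that the loop stops for
    every beta in [2, 4).  Returns the whole sequence f_0, ..., f_{n*}."""
    f = (beta - 1.0) / beta
    sequence = [f]
    while f > 1.0 / (beta - 1.0):
        f = ((beta - 1.0) * f - 1.0) / (beta * f - 1.0)
        sequence.append(f)
    return sequence

# For instance, procedure_I_fractions(3.9) yields a decreasing sequence whose
# length is the number of iterations Procedure I performs before Step (3).
```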
Lemma 3.5. If Procedure I finishes, then it returns an instance with a gap of at least
β/(1 + ε).
Proof. It is enough to show that if the procedure finishes then the makespan of any nonpreemptive solution is at least βC. We proceed by contradiction, assuming that instance In∗
returned by Procedure I has makespan strictly less than βC. Note that for the latter to
hold any job j in In∗ has to be assigned to the last machine i added by Procedure I for
which pij < ∞ (this is obvious for jobs in I0 , and follows inductively for jobs in In , n ≤ n∗ ).
Figure 3.3: Construction of instance In+1(β).
This implies that the load of all machines i(n∗; ℓ) (which were the last machines included),
due to jobs different from j(n∗), equals C/f_{n∗−1}. Indeed, each job j that was fractionally
assigned to any of these machines had x_{i(n∗;ℓ)j} = f_{n∗−1}, and i(n∗; ℓ) was the last machine for
which p_{ij} was bounded. Thus, as all machines i(n∗; ℓ) had load C in the fractional assignment,
they will have load C/f_{n∗−1} in the nonpreemptive solution.
Furthermore, job j(n∗), for which p_{i(n∗;ℓ)j(n∗)} = C(β − 1/f_{n∗−1}), must be processed on some
machine i(n∗; ℓ̄). Thus, the load of machine i(n∗; ℓ̄) is C/f_{n∗−1} + C(β − 1/f_{n∗−1}) = βC, which
is a contradiction.
To prove that the procedure in fact finishes, we first show a technical lemma.
Lemma 3.6. For each β ∈ [2, 4), if fn > 1/β, then fn+1 ≤ fn .
Proof. It follows from Equation (3.14) that
$$f_{n+1} - f_n = \frac{-\beta f_n^2 + \beta f_n - 1}{\beta f_n - 1}.$$
Note that the numerator of the last expression is always negative, since the quadratic
−βx² + βx − 1 has no real roots for 0 ≤ β < 4 and is negative at x = 0. The result follows since,
by hypothesis, the denominator of this expression is positive.
Lemma 3.7. Procedure I finishes.
Proof. We need to show that for every β ∈ [2, 4) there exists n∗ ∈ N such that f_{n∗} ≤ 1/(β − 1).
If this does not hold, then f_n > 1/(β − 1) > 1/β for all n ∈ N. Then Lemma 3.6 implies
that {f_n}_{n∈N} is a decreasing sequence. Therefore f_n must converge to some real number
L ≥ 1/(β − 1). Thus, Equation (3.14) implies that
$$L = \frac{(\beta - 1)L - 1}{\beta L - 1},$$
and therefore L is a root of −βx² + βx − 1 = 0, which is a contradiction.
We have proved the following theorem.
Theorem 3.8. For each β ∈ [2, 4) and ε > 0, there is an instance I of R||Cmax , for which the
optimal preemptive makespan is at most C(1 + ε), and the optimal nonpreemptive makespan
is at least βC.
Corollary 3.9. The integrality gap of [LL] is 4.
Chapter 4
Approximation algorithms for minimizing ∑ wL CL on unrelated machines
In this chapter we present approximation algorithms for the general case R|rij| ∑ wL CL
and its preemptive version R|rij, pmtn| ∑ wL CL. Most of the techniques used here are
generalizations of the methods shown in the previous chapter.
4.1 A (4 + ε)-approximation algorithm for R|rij, pmtn| ∑ wL CL
In the following we present a (4 + ε)-approximation algorithm for the preemptive version of
R|rij| ∑ wL CL. This means that for each ε > 0 we give a (4 + ε)-approximation algorithm
whose running time is polynomial in the size of the input and in 1/ε.
From now on we will assume without loss of generality that all processing times pij are
integers greater than or equal to 1. If this is not the case, we can discard the cases where pij = 0
as trivial and scale the remaining processing times.
The algorithm developed in this section is based on a time-indexed linear program, whose
variables represent the fraction of each job that is processed at each (discrete) point in time
on each machine. This kind of linear relaxation was originally introduced by Dyer and Wolsey
[13] for the problem 1|rj| ∑j wj Cj, and was later extended by Schulz and Skutella [35], who
used it to obtain a (3/2 + ε)-approximation and a (2 + ε)-approximation for R|| ∑ wj Cj and
R|rj| ∑ wj Cj respectively.
Let us consider a time horizon T, large enough so that it upper bounds the greatest completion
time of any reasonable schedule, for instance T = max_{i∈M,k∈J} {r_ik + ∑_{j∈J} p_ij}. We divide
the time horizon into exponentially-growing time intervals, so that there are only polynomially
many of them. For that, let ε be a fixed parameter, and let q be the first integer such that
(1 + ε)^{q−1} ≥ T. Then, we consider the intervals
[0, 1], (1, (1 + ε)], ((1 + ε), (1 + ε)²], . . . , ((1 + ε)^{q−2}, (1 + ε)^{q−1}].
To simplify the notation, let us define τ_0 = 0 and τ_ℓ = (1 + ε)^{ℓ−1} for each ℓ = 1, . . . , q.
With this, the ℓ-th interval corresponds to (τ_{ℓ−1}, τ_ℓ].
Given any preemptive schedule, let y_{jiℓ} be the fraction of job j that is processed on machine
i in the ℓ-th interval. Then, p_ij y_{jiℓ} is the amount of time that job j is processed on machine
i in the ℓ-th interval. Consider the following linear program:
[DW]   min ∑_{L∈O} wL CL
$$\sum_{i\in M}\sum_{\ell=1}^{q} y_{ji\ell} = 1 \qquad \text{for all } j \in J, \qquad (4.1)$$
$$\sum_{j\in J} p_{ij}\, y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} \qquad \text{for all } \ell = 1,\ldots,q \text{ and } i \in M, \qquad (4.2)$$
$$\sum_{i\in M} p_{ij}\, y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} \qquad \text{for all } \ell = 1,\ldots,q \text{ and } j \in J, \qquad (4.3)$$
$$\sum_{i\in M} \Big( y_{ji1} + \sum_{\ell=2}^{q} \tau_{\ell-1}\, y_{ji\ell} \Big) \le C_L \qquad \text{for all } L \in O,\ j \in L, \qquad (4.4)$$
$$y_{ji\ell} = 0 \qquad \text{for all } j, i, \ell : r_{ij} > \tau_\ell, \qquad (4.5)$$
$$y_{ji\ell} \ge 0 \qquad \text{for all } i, j, \ell. \qquad (4.6)$$
≤ CL
It is easy to see that this is a relaxation of our problem. Indeed, Equation (4.1) assures
that every job is completely processed. Equation (4.2) must hold since in each interval ℓ and
machine i the total amount of time available is at most τℓ − τℓ−1 . Similarly, Equation (4.3)
holds since no job can be simultaneously processed in two machines at the same time, and
therefore for a fixed interval the total amount of time that can be used to process a job is at
most the length of the interval. To see that Equation (4.4) is valid, notice that p_ij ≥ 1, and
thus C_L ≥ 1 for all L ∈ O. Also notice that C_L ≥ τ_{ℓ−1} for all L, j ∈ L, i, ℓ such that y_{jiℓ} > 0.
Thus, the left-hand side of Equation (4.4) is a convex combination of values that are at most C_L.
Finally, Equation (4.5) must hold since no part of a job can be assigned to an interval that
finishes before the job's release date on the corresponding machine.
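As an illustration, the following sketch builds [DW] as a model; the use of PuLP as the LP modeling interface, the function name, and the dictionary-based data layout (orders mapping L to its weight and job list, p[i][j] and r[i][j] for processing times and release dates) are assumptions made only for this example. It mirrors constraints (4.1)–(4.6) directly.

```python
import pulp  # assumed off-the-shelf LP modeling library

def build_DW(machines, jobs, orders, p, r, eps):
    """Sketch of the relaxation [DW]; orders maps L -> (w_L, list of jobs of L)."""
    T = max(r[i][k] + sum(p[i][j] for j in jobs) for i in machines for k in jobs)
    tau = [0.0, 1.0]                           # tau_0 = 0, tau_l = (1+eps)^(l-1)
    while tau[-1] < T:
        tau.append(tau[-1] * (1 + eps))
    q = len(tau) - 1
    prob = pulp.LpProblem("DW", pulp.LpMinimize)
    y = pulp.LpVariable.dicts("y", (jobs, machines, range(1, q + 1)), lowBound=0)
    C = pulp.LpVariable.dicts("C", list(orders), lowBound=0)
    prob += pulp.lpSum(w * C[L] for L, (w, _) in orders.items())          # objective
    for j in jobs:                                                        # (4.1)
        prob += pulp.lpSum(y[j][i][l] for i in machines for l in range(1, q + 1)) == 1
    for i in machines:
        for l in range(1, q + 1):                                         # (4.2)
            prob += pulp.lpSum(p[i][j] * y[j][i][l] for j in jobs) <= tau[l] - tau[l - 1]
    for j in jobs:
        for l in range(1, q + 1):                                         # (4.3)
            prob += pulp.lpSum(p[i][j] * y[j][i][l] for i in machines) <= tau[l] - tau[l - 1]
    for L, (_, jobs_L) in orders.items():                                 # (4.4)
        for j in jobs_L:
            prob += pulp.lpSum(y[j][i][1]
                               + pulp.lpSum(tau[l - 1] * y[j][i][l] for l in range(2, q + 1))
                               for i in machines) <= C[L]
    for j in jobs:                                                        # (4.5)
        for i in machines:
            for l in range(1, q + 1):
                if r[i][j] > tau[l]:
                    prob += y[j][i][l] == 0
    return prob, y, C
```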
As usual in approximation algorithms based on linear relaxations, we first compute the
optimal solution of [DW], and then transform it into a preemptive schedule whose cost is
within a constant factor from the optimal cost of [DW]. To construct the schedule we do as
follows. For any job j ∈ L, we truncate to zero all its variables that assign part of it to
an interval considerably later than CL . Afterwards, we use Algorithm: Nonparallel
Assignment (see Section 3.1) to construct a feasible schedule inside each interval, making
sure that no job is processed in two machines at the same time.
More precisely, let y∗_{jiℓ} and C∗_L be an optimal solution of [DW]. Let j ∈ J, and let L =
argmin{C∗_{L′} : L′ ∈ O, L′ ∋ j}. For a given parameter β > 1 (which will be appropriately chosen
later), we define
$$y'_{ji\ell} = \begin{cases} 0 & \text{if } \tau_{\ell-1} > \beta C^*_L, \\ y^*_{ji\ell}/Y_j & \text{if } \tau_{\ell-1} \le \beta C^*_L, \end{cases} \qquad (4.7)$$
where
$$Y_j = \sum_{i\in M}\ \sum_{\ell:\, \tau_{\ell-1} \le \beta C^*_L} y^*_{ji\ell}.$$
The modified solution y ′ satisfies the following lemma.
Lemma 4.1. The modified solution y′_{jiℓ}, obtained by applying Equation (4.7) to y∗_{jiℓ}, satisfies
$$\sum_{i\in M}\sum_{\ell=1}^{q} y'_{ji\ell} = 1 \qquad \text{for all } j, \qquad (4.8)$$
$$\sum_{j\in J} p_{ij}\, y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) \qquad \text{for all } i, \ell, \qquad (4.9)$$
$$\sum_{i\in M} p_{ij}\, y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) \qquad \text{for all } j, \ell, \qquad (4.10)$$
$$y'_{ji\ell} = 0 \ \text{ if } \tau_{\ell-1} > \beta C^*_L \qquad \text{for all } L \in O,\ j \in L. \qquad (4.11)$$
Proof. It is clear that y′_{jiℓ} satisfies (4.8), since
$$\sum_{i\in M}\sum_{\ell=1}^{q} y'_{ji\ell} = \sum_{i\in M}\sum_{\ell:\, y'_{ji\ell}>0} \frac{y^*_{ji\ell}}{Y_j} = \frac{1}{Y_j}\sum_{i\in M}\ \sum_{\ell:\, \tau_{\ell-1}\le \beta C^*_L} y^*_{ji\ell} = 1.$$
Furthermore, to show that equations (4.9) and (4.10) hold, note that
$$1 - Y_j = \sum_{i\in M}\ \sum_{\ell:\, \tau_{\ell-1}> \beta C^*_L} y^*_{ji\ell} \le \sum_{i\in M}\ \sum_{\ell:\, \tau_{\ell-1}> \beta C^*_L} y^*_{ji\ell}\,\frac{\tau_{\ell-1}}{\beta C^*_L} \le \frac{C^*_L}{\beta C^*_L} = \frac{1}{\beta}.$$
The last inequality follows from Equation (4.4), noting that ℓ ≥ 2 whenever τ_{ℓ−1} > βC∗_L.
Then Y_j ≥ (β − 1)/β, and thus y′_{jiℓ} ≤ β/(β − 1) y∗_{jiℓ}. With this, equations (4.9) and (4.10)
follow from equations (4.2) and (4.3). Finally, note that Equation (4.11) follows directly from
the definition of y′.
Equation (4.11) in the previous lemma implies that the variables y′_{jiℓ} only assign a job j ∈ L to
intervals that start no later than βC∗_L. On the other hand, as shown by equations
(4.9) and (4.10), the amount of load assigned to each interval may not fit in the available time
span τ_ℓ − τ_{ℓ−1}. Thus, we will have to increase the size of every interval by a factor β/(β − 1).
With the latter observations, we are ready to describe the algorithm.
Algorithm: Greedy Preemptive LP
1. Solve [DW] to optimality and call the solution y∗ and (C∗_L)_{L∈O}.
2. Define y′_{jiℓ} using Equation (4.7).
3. Construct a preemptive schedule S as follows.
(a) For each ℓ = 1, . . . , q, define x_ij = y′_{jiℓ} and C = (τ_ℓ − τ_{ℓ−1})β/(β − 1), and apply
Algorithm: Nonparallel Assignment to this fractional solution. Call the
preemptive schedule obtained S_ℓ.
(b) For each job j ∈ J that is processed by schedule S_ℓ at time t ∈ [0, C] on machine
i ∈ M, make schedule S process j on machine i at time t + τ_{ℓ−1}β/(β − 1).
Lemma 4.2. Algorithm: Greedy Preemptive LP constructs a feasible schedule where
the completion time of each order L ∈ O is less than C∗_L(1 + ε)β²/(β − 1).
Proof. Note that equations (4.9) and (4.10) imply that, for each ℓ = 1, . . . , q, x_ij = y′_{jiℓ}
and C = (τ_ℓ − τ_{ℓ−1})β/(β − 1) satisfy equations (3.2) and (3.3). Then, by Lemma 3.1, the
makespan of each schedule S_ℓ is less than (τ_ℓ − τ_{ℓ−1})β/(β − 1), and thus the schedule S_ℓ
defines the schedule S in the disjoint amplified interval [τ_{ℓ−1}β/(β − 1), τ_ℓ β/(β − 1)). Also,
it follows from Lemma 3.1 and Equation (4.8) that the schedule S completely processes every
job.
To bound the completion times of the orders, consider a fixed order L ∈ O and a job j ∈ L.
Let ℓ∗ be the last interval for which y′_{jiℓ} > 0 for some machine i ∈ M, i.e.,
$$\ell^* = \max_{i\in M} \max\{\ell \in \{1,\ldots,q\} \mid y'_{ji\ell} > 0\}.$$
Then, the completion time C_j is smaller than τ_{ℓ∗}β/(β − 1). To further bound C_j, we consider
two cases. If ℓ∗ = 1 then
$$C_j \le \frac{\beta}{\beta-1} \le C^*_L(1+\varepsilon)\frac{\beta}{\beta-1},$$
where the last inequality follows since C∗_L ≥ 1. On the other hand, if ℓ∗ > 1, Equation (4.11)
implies that
$$C_j \le \tau_{\ell^*}\frac{\beta}{\beta-1} \le \tau_{\ell^*-1}(1+\varepsilon)\frac{\beta}{\beta-1} \le C^*_L(1+\varepsilon)\frac{\beta^2}{\beta-1}.$$
Thus, by taking the maximum over all j ∈ L, the completion time of the order L is upper
bounded by C∗_L(1 + ε)β²/(β − 1).
Theorem 4.3. Algorithm: Greedy Preemptive LP is a (4 + ε)-approximation for
β = 2.
Proof. Let C_L be the completion time of order L given by Algorithm: Greedy Preemptive LP. Taking β = 2 in the last lemma, which is the optimal choice, it follows that
C_L ≤ C∗_L(1 + ε)β²/(β − 1) = 4(1 + ε)C∗_L. Then, multiplying C_L by its weight w_L and adding
over all L ∈ O, we conclude that the cost of the schedule constructed is no larger than 4(1 + ε)
times the cost of the optimal solution to [DW], which is a lower bound on the cost of the
optimal preemptive schedule.
4.2 A constant factor approximation for R|rij| ∑ wL CL
We now give the first constant factor approximation for the general problem R|rij| ∑ wL CL.
Our algorithm is based on an interval-indexed linear programming relaxation developed by Hall,
Schulz, Shmoys, and Wein [21], and on the rounding technique developed in Section 3.2.
Similarly as before, we consider a time horizon T, large enough so that it upper bounds the
greatest completion time of any reasonable schedule, for example T = max_{i∈M,k∈J} {r_ik + ∑_{j∈J} p_ij}.
We also divide the time horizon into exponentially-growing time intervals, so that
there are only polynomially many. For that, let α > 1 be a parameter which we will determine
later, and let q be the first integer such that α^{q−1} ≥ T. With this, consider the intervals
[1, 1], (1, α], (α, α²], . . . , (α^{q−2}, α^{q−1}].
To simplify the notation, let us define τ_0 = 1 and τ_ℓ = α^{ℓ−1} for each ℓ = 1, . . . , q. With
this, the ℓ-th interval corresponds to (τ_{ℓ−1}, τ_ℓ]. Let us remark that, in this setting, the first
interval starts and finishes at 1, contrary to the definition in the previous section, where the
first interval started at 0 and finished at 1.
To model the scheduling problem we consider the variables y_{jiℓ}, indicating whether job
j finishes on machine i in the ℓ-th interval. These variables allow us to write the
following linear program, based on that in [21], which is a relaxation of the scheduling problem
even when integrality constraints are imposed.
[HSSW]   min ∑_{L∈O} wL CL
$$\sum_{i\in M}\sum_{\ell=1}^{q} y_{ji\ell} = 1 \qquad \text{for all } j \in J, \qquad (4.12)$$
$$\sum_{s=1}^{\ell}\sum_{j\in J} p_{ij}\, y_{jis} \le \tau_\ell \qquad \text{for all } i \in M \text{ and } \ell = 1,\ldots,q, \qquad (4.13)$$
$$\sum_{i\in M}\sum_{\ell=1}^{q} \tau_{\ell-1}\, y_{ji\ell} \le C_L \qquad \text{for all } L \in O,\ j \in L, \qquad (4.14)$$
$$y_{ji\ell} = 0 \qquad \text{for all } i, \ell, j : p_{ij} + r_{ij} > \tau_\ell, \qquad (4.15)$$
$$y_{ji\ell} \ge 0 \qquad \text{for all } i, \ell, j. \qquad (4.16)$$
It is clear that [HSSW] is a relaxation of our problem. Indeed, for any nonpreemptive
schedule, define y_{jiℓ} = 1 iff job j finishes processing on machine i in the ℓ-th interval. Then
Equation (4.12) holds since each job finishes in exactly one interval and on one machine. The
left-hand side of (4.13) corresponds to the total load processed on machine i in the interval
[0, τ_ℓ], and therefore the inequality is valid. The double sum in inequality (4.14) corresponds
exactly to τ_{ℓ−1}, where ℓ is the interval in which job j finishes, which is at most C_j, and therefore
it is upper bounded by C_L if j ∈ L. The rule (4.15) imposes that some variables must be set
to zero before the LP is solved. This is valid since if p_ij + r_ij > τ_ℓ then job j cannot
finish before τ_ℓ on machine i, and therefore y_{jiℓ} will be zero.
Let (y∗_{jiℓ})_{jiℓ} and (C∗_L)_L be an optimal solution to [HSSW]. To obtain a feasible schedule
we need to round this solution into an integral one. For the special case where all orders are
singletons (as in Hall et al.'s [21] situation), (4.14) becomes an equality, so that one can directly
use Theorem 3.4, regarding each machine-interval pair of our problem as one machine in the
algorithm, to round a fractional solution to an integral solution of smaller total cost. When
doing this, the right-hand side of equation (4.13) is increased to τ_ℓ + max{p_ij : y_{jiℓ} > 0} ≤ 2τ_ℓ,
where the last inequality follows from (4.15). This can be used to derive a constant factor
approximation algorithm for the problem. In our setting, however, it is not possible to apply
the theorem directly, due to the nonlinearity of the objective function. We thus take a detour
in the same manner as in Section 4.1: we round down to zero all variables y∗_{jiℓ} for which τ_{ℓ−1}
is considerably bigger than a certain parameter β times C∗_L, for L = argmin{C∗_{L′} : L′ ∋ j} (and
we will optimize over β later on). For that we define the variables y′_{jiℓ} using Equation (4.7).
With this, the next lemma follows from calculations similar to those of Lemma 4.1.
Lemma 4.4. The modified solution y′_{jiℓ} ≥ 0 satisfies:
$$\sum_{i\in M}\sum_{\ell=1}^{q} y'_{ji\ell} = 1 \qquad \text{for all } j \in J, \qquad (4.17)$$
$$\sum_{s=1}^{\ell}\sum_{j\in J} p_{ij}\, y'_{jis} \le \frac{\beta}{\beta-1}\tau_\ell \qquad \text{for all } i \in M \text{ and } \ell = 1,\ldots,q, \qquad (4.18)$$
$$y'_{ji\ell} = 0 \qquad \text{if } p_{ij} + r_{ij} > \tau_\ell \text{ or } \tau_{\ell-1} > \beta C^*_L, \quad \forall\, i, j, \ell, L : j \in L. \qquad (4.19)$$
With the previous lemma at hand we are in position to apply Theorem 3.4. To do this
we regard each machine-interval pair of our problem as one machine of the algorithm. In
other words, defining the set of machines M′ = M × {1, . . . , q} and x_{hj} = y′_{jh₁h₂} for each
h = (h₁, h₂) ∈ M′, we round x_{hj} to an integral solution x̂_{hj} := ŷ_{jh₁h₂} ∈ {0, 1} satisfying
equations (4.17), (4.19) and
$$\sum_{j\in J} \hat{y}_{ji\ell}\, p_{ij} \le \sum_{j\in J} y'_{ji\ell}\, p_{ij} + \max\{p_{ij} : y'_{ji\ell} > 0,\ j \in J\} \le \sum_{j\in J} y'_{ji\ell}\, p_{ij} + \tau_\ell, \qquad (4.20)$$
where the first inequality follows from (3.11) and the second from (4.19).
We are now ready to give the algorithm for R|rij| ∑ wL CL.
Algorithm: Greedy-LP
(1) Solve [HSSW], obtaining an optimal solution (y∗_{jiℓ}) and (C∗_L)_L.
(2) Modify the solution according to (4.7) to obtain (y′_{jiℓ}) satisfying (4.17), (4.18), and
(4.19).
(3) Round (y′_{jiℓ}) using Theorem 3.4 to obtain an integral solution (ŷ_{jiℓ}) as above.
(4) Let J_{iℓ} = {j ∈ J : ŷ_{jiℓ} = 1}. Greedily schedule on each machine i all jobs in ∪_{ℓ=1}^{q} J_{iℓ},
starting from those in J_{i1} until we reach J_{iq} (with an arbitrary order inside each set
J_{iℓ}), respecting the release dates.
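A minimal sketch of Step (4) follows; the data layout (the sets J_{iℓ} given as nested dicts) and the function name are assumptions made only for illustration, and the order inside each set is arbitrary, as the algorithm allows.

```python
def greedy_schedule(machines, q, J_sets, p, r):
    """Step (4) of Algorithm: Greedy-LP (sketch).
    J_sets[i][l]: jobs j with hat{y}_{jil} = 1;  p[i][j], r[i][j]: times and releases.
    Each machine processes its sets J_{i1}, ..., J_{iq} one after another,
    never starting a job before its release date."""
    completion = {}
    for i in machines:
        t = 0.0
        for l in range(1, q + 1):
            for j in J_sets[i].get(l, []):
                t = max(t, r[i][j]) + p[i][j]   # wait for the release date, then run j
                completion[j] = t
    return completion
```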
To break down the analysis, let us first show that Greedy-LP is a constant factor approximation for the case in which all release dates are zero.
Theorem 4.5. Algorithm Greedy-LP is a (27/2)-approximation for R|| ∑ wL CL.
Proof. Let us fix a machine i and take a job j ∈ L such that ŷ_{jiℓ} = 1, so that j ∈ J_{iℓ}. Clearly
C_j, the completion time of job j in algorithm Greedy-LP, is at most the total processing
time of jobs in ∪_{s=1}^{ℓ} J_{is}. Then,
$$C_j \le \sum_{s=1}^{\ell}\sum_{k\in J} p_{ik}\,\hat{y}_{kis}
\le \sum_{s=1}^{\ell}\Big(\sum_{k\in J} p_{ik}\, y'_{kis} + \tau_s\Big)
\le \frac{\beta}{\beta-1}\tau_\ell + \sum_{s=1}^{\ell}\tau_s
\le \Big(\frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1}\Big)\tau_{\ell-1}
\le \beta\alpha\Big(\frac{\beta}{\beta-1} + \frac{\alpha}{\alpha-1}\Big)C^*_L.$$
The second inequality follows from (4.20), the third from (4.18), and the fourth from
the definition of the τ_s. The last inequality follows since, by condition (3.11), ŷ_{jiℓ} = 1 implies
y′_{jiℓ} > 0, so that by (4.19) we have τ_{ℓ−1} ≤ βC∗_L. Optimizing over the approximation factor,
the best possible factor given by this method is attained at α = β = 3/2, and therefore
we conclude that C_j ≤ 27/2 · C∗_L. As this holds for all j ∈ L, we conclude
that C_L = max_{j∈L} C_j ≤ 27/2 · C∗_L. The claimed approximation factor follows directly by
multiplying this inequality by w_L, adding over all L ∈ O, and using the fact that the optimal
value of [HSSW] is a lower bound on the cost of an optimal schedule.
Theorem 4.6. Algorithm Greedy-LP is a (27/2)-approximation for R|rij| ∑ wL CL.
Proof. Similarly to the proof of the previous theorem, we will show that the solution given by
Algorithm Greedy-LP satisfies C_j ≤ 27/2 · C∗_L, even in the presence of release dates.
Let us define
$$\bar\tau_\ell := \frac{1}{\alpha-1} + \sum_{s=1}^{\ell}\Big(\sum_{k\in J} p_{ik}\, y'_{kis} + \tau_s\Big).$$
We will see that it is possible to schedule every set of jobs J_{iℓ} on machine i between time τ̄_{ℓ−1}
and time τ̄_ℓ (with an arbitrary order inside each interval), respecting all release dates. Indeed,
assuming that 1 < α ≤ 2, it follows from (3.11) and (4.19) that for every j ∈ J_{iℓ},
$$r_{ij} \le \tau_\ell \le \frac{1}{\alpha-1} + \frac{\tau_\ell - 1}{\alpha-1} = \frac{1}{\alpha-1} + \sum_{k=1}^{\ell-1}\tau_k \le \bar\tau_{\ell-1}.$$
Thus job j is available for processing on machine i at time τ̄_{ℓ−1}. On the other hand, note that
τ̄_ℓ − τ̄_{ℓ−1} = ∑_{j∈J} p_{ij} y′_{jiℓ} + τ_ℓ, so it follows from (4.20) that all jobs in J_{iℓ} fit inside (τ̄_{ℓ−1}, τ̄_ℓ].
We conclude that in the schedule constructed by Greedy-LP any job j ∈ J_{iℓ} is processed
before τ̄_ℓ. Therefore, as in the previous theorem,
$$C_j \le \frac{1}{\alpha-1} + \sum_{s=1}^{\ell}\Big(\sum_{k\in J} p_{ik}\, y'_{kis} + \tau_s\Big)
\le \Big(\frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1}\Big)\tau_{\ell-1}
\le \beta\alpha\Big(\frac{\beta}{\beta-1} + \frac{\alpha}{\alpha-1}\Big)C^*_L.$$
Again, choosing α = β = 3/2, we obtain C_j ≤ 27/2 · C∗_L.
Chapter 5
A PTAS for minimizing ∑ wL CL on parallel machines
In this chapter we design a PTAS for P|part| ∑ wL CL under some additional constraint. We
assume that there are either a constant number of machines, a constant number of jobs per
order, or a constant number of orders. First, we will describe the case where the number of
jobs of each order is bounded by a constant K, and then we will justify that this implies
the existence of PTASs for the other cases. The results in this chapter closely follow the
PTAS developed for P|rj| ∑ wj Cj in [1]. However, our setting is technically more involved, mainly for
three reasons: first, it is crucial to show that there is a near-optimal schedule such that
the time-span of every order is small, and, furthermore, the precise localization of orders is
significantly more complicated; also, as we shall see later, it is important that all the near-optimal solutions that we construct satisfy the properties of Lemma 5.1; finally, we need to
be slightly more careful in the final placing of jobs.
As usual in the design of approximation schemes, the general idea is to add structure to
the solution by modifying the instance in such a way that the cost of the optimal solution
is not worsened by more than a (1 + ε) factor. Also, by applying several modifications to the
optimal solution of this new instance we will prove that there exists a near-optimal solution
that satisfies several extra properties. The structure given by these properties allows us to find
this solution by enumeration or dynamic programming. As each of the modifications
that we apply to the optimal solution only generates a loss of at most a factor of
(1 + ε) in the cost, and we apply only a constant number of them, we end up with a
solution whose cost is within a factor of (1 + ε)^{O(1)} of the cost of the optimal schedule. Then,
choosing a small enough ε we can approximate the optimum up to any desired factor. To simplify the
notation, in what follows we assume that all processing times are positive integers and that
1/ε is also an integer. Also, in what follows all logarithms are taken base (1 + ε),
unless explicitly stated otherwise. Besides, we denote by p(L) = ∑_{j∈L} pj the total processing
time of a set L ⊆ J.
As in the previous chapter, we will partition the time horizon into exponentially increasing
intervals. For every integer t we denote by It the interval [(1 + ε)^t, (1 + ε)^{t+1}), and we
denote the size of the interval by |It|, so that |It| = ε(1 + ε)^t.
Besides rounding and partitioning, a basic procedure we will use repeatedly is that of
stretching, which consists of stretching the time axis by a factor of (1 + ε). Of course, this
only worsens the solution by a factor of (1 + ε). The two basic stretching procedures we use
are the following (a minimal sketch of the first one is given right after this list):
1. Stretch Completion Times: This procedure consists of delaying all jobs, so that
the completion time of a job j becomes C′_j = (1 + ε)C_j in the new schedule. This
increases the cost of the solution by exactly a factor of (1 + ε). The procedure creates a
gap of idle time of size εp_j before each job j. Indeed, if k was the job being processed
just before j, then C′_j − C′_k = C_j − C_k + ε(C_j − C_k) ≥ C_j − C_k + εp_j.
2. Stretch Intervals: The objective of this procedure is to create idle time in every
interval, except for those having a job that completely crosses them. As before, it
consists of shifting jobs to the following interval. More precisely, if job j finishes in It
and occupies d_j time units in It, we move j to I_{t+1} by pushing it exactly |It| time
units, so that it also uses exactly d_j time units in I_{t+1}. Then, the completion time of each
job j in the new schedule is at most (1 + ε)C_j, and therefore the overall cost is increased by
at most a factor (1 + ε).
Note that, if j started processing in It and was being processed in It for d_j time units,
after the shifting it will be processed in I_{t+1} for at most d_j time units. Since I_{t+1}
has ε|It| = ε²(1 + ε)^t more time units than It, at least that much idle time is
created in I_{t+1}. Also, we can assume that this idle time is consecutive in each interval.
Indeed, this can be accomplished by moving to the left, as much as possible, all jobs
that are scheduled completely inside an interval.
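As announced above, here is a minimal sketch of the first stretching procedure; the schedule representation (job mapped to its machine, completion time and processing time) is only an assumption made for the example.

```python
def stretch_completion_times(schedule, eps):
    """Stretch Completion Times (sketch): every completion time is scaled by
    (1 + eps) while machines and processing times are kept unchanged, which
    opens an idle gap of at least eps * p_j right before each job j."""
    return {j: (i, (1 + eps) * C_j, p_j) for j, (i, C_j, p_j) in schedule.items()}
```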
After applying the procedures we will also shift the index of the intervals to the right, so
that if a job was being processed in interval It, it will still be processed in It in the new schedule.
This gives the illusion that we have stretched time or intervals by a (1 + ε) factor. Before
giving a general description of the algorithm we show that there exists a (1 + ε)-approximate
schedule where no order crosses more than O(1) intervals. For this, we first show the following
basic property, which is stated for the more general case of unrelated machines.
Lemma 5.1. For any instance of R|part| ∑ wL CL there exists an optimal schedule such that:
1. For any order L ∈ O and for any machine i = 1, . . . , m, all jobs in L assigned to i are
processed consecutively.
2. The sequence in which the orders are arranged inside each machine is independent of
the machine.
Proof. Let us consider an optimal schedule of the problem. For a given order-machine pair L
and i, let j ∗ be the last job in L that is assigned to i. It is easy to see that any job in L that
is processed in i before j ∗ can be processed just before j ∗ without increasing the completion
time of any order. With this we conclude the first property of the lemma. For the rest of the
proof we will assume that the optimal solution satisfies this property.
For the second property, note that inside each machine the orders can be arranged following
their completion times C_{L1} ≤ C_{L2} ≤ . . . ≤ C_{Lk} without increasing the cost of the solution. If
this does not hold, then there exist two orders L, L′ such that C_L ≤ C_{L′} and in some machine
i the jobs of L′ are processed just before the ones in L. If this is the case, then it is clear that
interchanging these two sets of jobs in machine i does not increase the cost of the solution,
since jobs in L will decrease their completion time while jobs in L′ will still complete before
C_L. Therefore, due to the fact that C_L ≤ C_{L′}, the completion time of L′ in this new schedule
will remain the same. The procedure can be iterated until the second property in the lemma
is satisfied.
Lemma 5.2. Let s := ⌈log(1 + 1/ε)⌉. Then there exists a (1 + ε)-approximate schedule in
which every order is fully processed in s + 1 consecutive intervals.
Proof. Let us consider an optimal schedule as in Lemma 5.1 and apply Stretch Completion
Times. Then we move all jobs to the right as much as possible without increasing the
completion time of any order. Note that for any order L, each job j ∈ L increased its
completion time by at least εCL . Indeed, if this is not the case let L be the last order
(in terms of completion time) for which there exists j ∈ L that increased its completion
time by less than εCL . Let i be the machine processing j. Lemma 5.1 implies that all
43
jobs processed in i after job j belong to orders that finish later than CL and thus they
increase their completion time by at least εCL . As the completion time of order L was also
increased by εCL , we conclude that job j could be moved to the right by εCL contradicting
the assumption.
This implies that after moving jobs to the right, the starting point of each order L ∈ O,
SL, will be at least εCL, and therefore CL − SL ≤ SL/ε. Let Ix and Iy be the intervals where L
starts and finishes respectively; then (1 + ε)^y − (1 + ε)^{x+1} ≤ CL − SL ≤ SL/ε ≤ (1/ε)(1 + ε)^{x+1},
which implies that y − x − 1 ≤ log(1 + 1/ε) ≤ s.
5.1 Algorithm overview.
In the following we describe the general idea of the PTAS. Let us divide the time horizon into
blocks of s + 1 = ⌈log(1 + 1/ε)⌉ + 1 intervals, and denote by Bℓ the block [(1 + ε)^{ℓ(s+1)}, (1 + ε)^{(ℓ+1)(s+1)}).
Lemma 5.2 suggests optimizing over each block separately, and later putting the
pieces together to construct a global solution.
Since there may be orders that cross from one block to the next, it will be necessary to
perturb the “shape” of blocks. For that we introduce the concept of frontier. The outgoing
frontier of block Bℓ is a vector that has m entries. Its i-th coordinate is a guess on the
completion time of the last job scheduled in machine i among jobs that belong to orders
that began processing in Bℓ (in Section 5.4 we will see that there is a concise description of
frontier). On the other hand, the incoming frontier of a block is the outgoing frontier of the
previous one. For a given block and incoming and outgoing frontiers, we will say that an
order is scheduled inside block Bℓ if in each machine all jobs in that order begin processing
after the incoming frontier, and finish processing before the outgoing frontier.
Assume that we know how to compute a near-optimal solution for a given subset of orders
V ⊆ O inside a block Bℓ , with fixed incoming and outgoing frontiers F ′ and F , respectively.
Let W (ℓ, F ′ , F, V ) be the cost (sum of weighted completion times) of this solution.
Let Fℓ be the set of possible outgoing frontiers of block Bℓ . Using dynamic programming,
we can fill a table T (ℓ, F, U ) containing the cost of a near-optimal schedule for the subset of
orders U ⊆ O in block Bℓ or before, respecting the outgoing frontier F of Bℓ . To compute
this quantity, we can use the recursive formula:
$$T(\ell+1, F, U) = \min_{F' \in \mathcal{F}_\ell,\, V \subseteq U} \big\{T(\ell, F', V) + W(\ell+1, F', F, U \setminus V)\big\}.$$
Unfortunately, the table T is not of polynomial size, or even finite. Hence, it will be necessary
to reduce its size as done in [1]. Summarizing, the outline of the algorithm is the following.
Algorithm: PTAS-DP
1. Localization: In this step we will bound the time-span of the intervals in which each
order may be processed. We give extra structure to the instance and define a release
date rL for each order L, such that there exists a near-optimal solution where each
order begins processing after rL and ends processing no later than a constant number
of intervals after rL . More precisely, we prove that each order L is scheduled in the
interval [rL , rL · (1 + ε)g(ε,K) ], for some function g that will be specified later. This plays
a crucial role in the next step.
2. Polynomial Representation of Order's Subsets: The goal of this step is to reduce the
number of subsets of orders that need to be tried in the dynamic programming. To do this,
for all ℓ, we find a polynomial-size set Θℓ ⊆ 2^O of possible subsets of orders that are
processed in Bℓ or before in some near-optimal schedule.
3. Polynomial Representation of Frontiers: In this step we reduce the number of frontiers
we need to try in the dynamic programming. For all ℓ, we find F̂ℓ ⊂ Fℓ, a set of
polynomial size, such that for each block the outgoing frontier of a near-optimal schedule
belongs to F̂ℓ.
4. Dynamic Programming: For all ℓ, F ∈ F̂_{ℓ+1}, U ∈ Θℓ compute:
$$T(\ell, F, U) = \min_{F' \in \hat{\mathcal{F}}_\ell,\; V \subseteq U,\; V \in \Theta_{\ell-1}} \big\{T(\ell-1, F', V) + W(\ell, F', F, U \setminus V)\big\}.$$
It is clear that it is not necessary to compute exactly W(ℓ, F′, F, U \ V); a (1 + ε)-approximation of this value, which moves the frontiers by at most a factor of (1 + ε), is
enough. To compute this approximation we partition jobs into small and large. For
large jobs we use enumeration and essentially try all possible schedules, while for small
jobs we greedily schedule them using Smith's rule.
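The following schematic sketch shows how such a table could be filled; the representation of frontiers and of the sets Θℓ, as well as the subroutine W, are assumed to be given, and for simplicity the sketch indexes candidate frontier sets by the block whose outgoing frontier they describe (matching the recurrence stated earlier in this section).

```python
def fill_table(num_blocks, frontiers, theta, W):
    """Sketch of the dynamic program described above.
    frontiers[l]: candidate outgoing frontiers of block B_l;
    theta[l]: candidate frozensets of orders finished in B_l or before (Theta_l);
    W(l, F_in, F_out, orders): near-optimal cost of scheduling `orders` inside B_l
    with incoming frontier F_in and outgoing frontier F_out (assumed given)."""
    T = {}
    for F in frontiers[0]:
        T[(0, F, frozenset())] = 0.0          # base case: nothing scheduled yet
    for l in range(1, num_blocks + 1):
        for F in frontiers[l]:
            for U in theta[l]:
                best = float("inf")
                for F_prev in frontiers[l - 1]:
                    for V in theta[l - 1]:
                        if V <= U and (l - 1, F_prev, V) in T:
                            cand = T[(l - 1, F_prev, V)] + W(l, F_prev, F, U - V)
                            best = min(best, cand)
                T[(l, F, U)] = best
    return T
```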
One of the main difficulties of this approach is that all the modifications applied to the
optimal solution must preserve the properties given by Lemma 5.1. This is necessary to
be able to describe the interaction between one block and the following one by using only the
frontier. In other words, if this were not true, it could happen that some jobs of an order that
begins processing in a block Bℓ are processed after a job of an order that begins processing in
block B_{ℓ+1}. This would greatly increase the complexity of the algorithm, since this interaction
would need to be considered in the dynamic programming, which would become too large.
This is also the main reason why our result does not directly generalize to the case where we
have release dates, since then Lemma 5.1 does not hold. In the sequel we will analyze each
of the previous steps separately.
5.2 Localization
Lemma 5.2 shows that each order is completely processed in at most a constant number, s + 1, of
consecutive intervals. However, we do not know a priori when in time each order is processed.
In what follows, we refine this result by explicitly finding a constant number of consecutive
intervals in which each order is processed in a near-optimal schedule. This property will be
helpful in Step 2 to guess the specific block in which each order will be processed.
The localization is done by introducing artificial release dates, i.e., for each order L
we give a point in time rL such that, losing a factor of at most (1 + ε) in the cost, L
starts processing after rL. Naturally, it is enough to consider release dates which are powers
of (1 + ε). The release dates are chosen so that the total amount of processing time released
at any point (1 + ε)^t is (1 + ε)^t O(m). This will be sufficient to show that in a (1 + ε)-approximate schedule all orders finish processing within a constant number of intervals after
they are released. The following definition will be useful in the description of the algorithm.
Definition 5.3. A job j is said to be small with respect to a time instant T if pj ≤ ε3 T .
Otherwise, we say that j is big. Also, an order L is said to be small with respect to a time
instant T if p(L) ≤ ε2 T . Otherwise, we say that L is big.
Algorithm: Localization
1. Initialize rj := (1 + ε)^⌊log εpj⌋, u := log(min_{j∈J} rj), v := ⌈log(∑_{j∈J} pj)⌉, and for all
L ∈ O, rL := max_{j∈L} rj (1 + ε)^{−s}. Also let P := ⌈log(max_{j∈J} pj)⌉.
2.
(i) For all orders L ∈ O sort jobs in L in nonincreasing order of their size. Then
greedily assign jobs to groups until the total processing time of each group just
surpasses ε2 rL . After this process, there may be at most one group of size smaller
than ε2 rL .
46
(ii) If this smaller group is of total processing time smaller than ε3 rL we add it to the
biggest group and otherwise we leave it as a group.
After this process, we redefine jobs in L as the newly created groups, and define the
release dates of the new jobs, rj := (1 + ε)⌊log εpj ⌋ .
3. For all j ∈ J round its processing time to the next power of 1+ε. I.e., pj := (1+ε)⌈log pj ⌉ .
Recall that K = O(1) is the maximum number of jobs per order. For all L ∈ O define
its order type T (L) ∈ {0, . . . , K}P , as a vector whose p-th component is the number of
jobs in L with processing time equal to (1 + ε)p , i.e., T (L)p := |{j ∈ L : log pj = p}|.
4. For all t = u, . . . , v,
(i) Define the set Ot := {L ∈ O : L is big with respect to (1 + ε)t and rL = (1 + ε)t }.
(ii) For α ∈ {0, . . . , K}^P let O_t^α be the set that contains the K(1 + ε)^{s+2} m/ε^5 orders
of largest weight in {L ∈ Ot : T(L) = α}.
(iii) Define Qt := ∪_{α∈{0,...,K}^P} O_t^α.
(iv) For all L ∈ Ot \ Qt , redefine rL := (1 + ε)t+1 .
5. For all t = u, . . . , v,
(i) Define St := {L ∈ O : L is small with respect to (1 + ε)t and rL = (1 + ε)t } and
sort all orders L ∈ St in non-increasing order of wL /p(L).
(ii) Define Rt as the set that contains the first orders in St such that their total
processing time is in [mε(1 + ε)t , mε(1 + ε)t + mε3 (1 + ε)t ].
(iii) For all L ∈ St \ Rt , redefine rL := (1 + ε)t+1 .
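For concreteness, a minimal sketch of the grouping of Step (2) follows; the function name and the dictionary-based data layout are illustrative assumptions.

```python
def group_small_jobs(order_jobs, p, r_L, eps):
    """Step (2) of Algorithm: Localization for a single order L (sketch).
    Jobs are taken in nonincreasing size and packed greedily into groups whose
    total size just surpasses eps^2 * r_L; a leftover group of total size below
    eps^3 * r_L is merged into the biggest group, otherwise it is kept."""
    groups, current = [], []
    for j in sorted(order_jobs, key=lambda j: p[j], reverse=True):
        current.append(j)
        if sum(p[k] for k in current) > eps ** 2 * r_L:
            groups.append(current)
            current = []
    if current:
        if groups and sum(p[k] for k in current) < eps ** 3 * r_L:
            max(groups, key=lambda g: sum(p[k] for k in g)).extend(current)
        else:
            groups.append(current)
    return groups  # each group is then treated as a single new job
```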
In Step (1) we begin by defining a release date rj for every job, i.e., a point in time after
which the job starts processing in some (1 + ε)-approximate schedule. It is easy to see that
this is valid, since applying Stretch Completion Times ensures that no job starts processing
before εpj. Afterwards, we define the values u and v, which give lower and upper bounds for
the indices of the time intervals in which jobs may be processed. In other words, we can restrict
attention to intervals It with t ∈ {u, u + 1, . . . , v}. Finally, for every order L ∈ O we initialize
a release date rL := max_{j∈L} rj (1 + ε)^{−s}. This is valid because for every order L at least one
of its jobs begins processing after max_{j∈L} rj, and Lemma 5.2 assures that no order crosses
more than s intervals.
Clearly, this initial definition of the release dates is not enough to ensure that at each time
instant (1 + ε)^t there will be no more than (1 + ε)^t O(m) total processing time released. To
amend this we will delay the release dates of orders that will not be able to start before the
next integer power of (1 + ε). For that we first classify the orders by the sizes of their jobs.
Note that given two orders that are indistinguishable except for their weight, i.e., such that there
is a one-to-one correspondence between the sizes of their jobs, the jobs of the one with
larger weight will always have priority over the jobs of the other order. Therefore, among
a set of orders that are indistinguishable except for their weight (we say that these orders are
of the same type) we can greedily process the ones with larger weight first. This is the key
argument to justify the delaying of release dates. Nevertheless, to successfully apply this we
need to bound the number of types of orders. To do this we proceed in two steps. First we
get rid of jobs that are too small. Then, we round the processing time of every job to bound
the number of values a processing time can take.
Step (2) gets rid of every small job with respect to the release date of its order. This is
done by considering several small jobs as one bigger one. The procedure is justified by the
following lemma.
Lemma 5.4. There is a (1 + O(ε))-approximate solution to the scheduling problem, in which
all jobs in each group defined in Step (2) of Algorithm: Localization are processed
together in the same machine.
Proof. Let us first consider the groups of jobs defined in Step (2.i), and let us consider a
(1 + O(ε))-approximate schedule of the original instance. We will find another (1 + O(ε))-approximate schedule in which all jobs inside each group are processed together.
Then, losing at most a factor of 1 + O(ε) in the cost, every group can be considered as a
single larger job. Notice that the groups consisting of only one job need not be considered
in the proof. The rest of the groups consist only of jobs smaller than ε²rL, and therefore by
construction their total processing time is smaller than 2ε²rL. Also, since all of these jobs
have pj ≤ ε²rL, we can consider a near-optimal schedule where none of them is processed in
more than one interval. Indeed, using Stretch Intervals we create enough space to schedule
all these crossing jobs completely inside the interval where they begin processing.
Let us thus fix a group, and consider the machines and intervals in which the jobs that
belong to it are being processed. Interpreting a machine-interval pair as a virtual machine,
the group can be interpreted as a virtual job that is fractionally assigned to the virtual
machines containing its jobs. Now we can apply Shmoys and Tardos' theorem (Theorem
3.4) to round this fractional solution so that each virtual job is processed completely inside
a virtual machine. The rounding guarantees that the total processing time assigned to each
virtual machine is not increased by more than 2ε²rL, since this is the largest a virtual job
can be. By applying Stretch Intervals twice we create the extra space needed. Also, the
completion time of the virtual job is increased by at most a factor 1 + ε, since the rounding
only assigns a virtual job to an interval if some of its jobs were previously assigned to it.
Therefore the completion time of no order is further increased. We conclude that we can
consider each of these groups as one larger job.
To finish the proof we must show that if the smallest job of an order has processing
time smaller than ε³rL, we can merge it with the biggest job. Indeed, if L was left with more
than one job after merging jobs into groups, the biggest job has processing time at least ε²rL.
By applying Stretch Completion Times we create a gap of idle time of at least ε³rL before
it, leaving enough space to fit the smallest job in there.
Remark that after this step we can guarantee that no big order L contains a small job
with respect to rL. In Step (3) we first reduce the number of possible values a processing time
can take by rounding processing times to powers of 1 + ε. It is easy to see that this does not increase
the cost of the solution by more than a factor 1 + ε. Indeed, applying Stretch Completion
Times leaves enough space so that we can increase the size of every job j up to (1 + ε)pj, and
(1 + ε)^⌈log pj⌉ ∈ [pj, (1 + ε)pj]. In the second part of Step (3) we classify orders by saying
how many jobs of each size they contain. Since we are assuming that every processing time
is at least one and a power of (1 + ε), there are only P := ⌈log(max_{j∈J} pj)⌉ possible
values a processing time can take.
In Step (4) we delay the release dates of big orders that do not have any chance of being
processed at their current release date. For that, we let Ot be the set of big orders released
at (1 + ε)^t. We further partition Ot by the type of the orders. As explained before, for each
order type, the orders with largest weight will be processed first, and therefore we can delay
the release dates of the orders with smallest weight that do not fit in It. In the following we
will show that for any type of big order α and for any t, at most K(1 + ε)^{s+2} m/ε^5 = O(m)
orders belonging to O_t^α can be processed in It, and therefore the rest can be delayed. The
next lemma helps us to accomplish this.
Lemma 5.5. After delaying at most ⌈log(K/ε3 ) + s + 1⌉ times the release date of an order,
the order becomes small with respect to its new release date.
Proof. Let rL be the release date of an order L as it was initialized in Step (1). Note that
the definition of rL and rj implies that pj ≤ rL (1 + ε)s+1 /ε, and since there are only K
jobs per order then p(L) ≤ rL K(1 + ε)s+1 /ε. If the release date has been delayed at least
⌈log(K/ε3 ) + s + 1⌉ times, then p(L) ≤ (1 + ε)log(rL )+r K(1 + ε)s−r+1 /ε ≤ ε2 (1 + ε)log rL +r , and
therefore L is a small order with respect to its new release date.
Recall that for the original release dates, every job belonging to a big order satisfies
pj ≥ ε3 rL . Therefore the last lemma implies that at any point in the algorithm, if L is a big
order with respect to its current release date rL , then any job j belonging to L satisfies that
pj ≥ ε^6 rL/(K(1 + ε)^{s+2}). Thus, at most
$$m \cdot |I_{\log r_L}| \cdot \frac{K(1+\varepsilon)^{s+2}}{\varepsilon^6 r_L} = \frac{K(1+\varepsilon)^{s+2} m}{\varepsilon^5}$$
orders of each type in each Otα can start before (1 + ε)t+1 . The rest can have their release
date increased to (1 + ε)t+1 without further affecting the cost. With this, at each point in
time (1 + ε)t , and for each type of big order, there will be only
(1 + ε)t mK(1 + ε)2s+3 /ε6 = (1 + ε)t O(mK/ε7 ) = (1 + ε)t O(m)
total processing time released at (1 + ε)^t. To conclude that there will be in total (1 + ε)^t O(m)
processing time of big orders released at (1 + ε)^t, it is sufficient to show that there are only
O(1) different types of big orders in Ot.
Lemma 5.6. At any point in the algorithm and at any time index t, there are only a constant
number, K O(log(K/ε)) , of different types of big orders released at (1 + ε)t .
Proof. As shown before, every job j that belongs to an order L ∈ Ot satisfies
$$\frac{\varepsilon^6 (1+\varepsilon)^t}{K(1+\varepsilon)^{s+2}} \le p_j \le \frac{(1+\varepsilon)^t (1+\varepsilon)^{s+1}}{\varepsilon},$$
where the second inequality follows since pj is smaller than (1 + ε)^{s+1}/ε times the release date
of L ∋ j. Thus, pj can only take ⌈2s + 3 − 7 log ε + log K⌉ = O(log(K/ε)) different values,
and by definition of type there will not be more than (K + 1)^{O(log(K/ε))} different ones.
Summarizing, we have proved the following.
Theorem 5.7. At the end of the algorithm, there will be f1 (ε, K)m(1 + ε)t total processing
time corresponding to big orders (w.r.t (1 + ε)t ) released at (1 + ε)t , for every t ∈ {u, . . . , v}.
Here the function f1 (ε, K) is given by K O(log(K/ε)) /ε7 = K O(log(K/ε)) .
With this we have completely dealt with big orders, but in the process we have created
several orders that are small with respect to their release dates. In Step (5) we deal with
these newly created orders. As before, we must define release dates such that at any instant
(1 + ε)t there are at most (1 + ε)t O(m) total processing time corresponding to small orders
(w.r.t (1 + ε)t ) released at (1 + ε)t . As in the big orders case, we delay orders that can not
begin processing until the following release date. The following lemma explains how this is
possible.
Lemma 5.8. Given a feasible schedule of the big orders (w.r.t. their release dates), and losing at most a
factor of 1 + O(ε), the small orders (w.r.t. their release dates) can be scheduled by a list scheduling
procedure in nonincreasing order of wL/p(L).
Proof. Let us consider a fixed schedule of the big orders. Notice that each small order can be
considered as just one job. Indeed, applying the same argument as in Lemma 5.4, we can
consider an order as a virtual job partially assigned to machine-interval pairs, and apply
Shmoys and Tardos' theorem (Theorem 3.4).
Let us call the midpoint of a job j the value Mj = Cj − pj /2. Note that since we are
only considering orders that are small with respect to their release date, a near-optimal
schedule minimizing the sum of weighted midpoints is also near optimal for minimizing sum
of weighted completion times. Indeed, this follows since in this case the starting time of a
job is within a 1 + O(ε)-factor from its completion time.
The last observation leads us to consider the problem of optimizing the sum of weighted
midpoints on a single variable-speed machine. The speed of the machine at time s is given
by v(s), where v(s) is the number of machines that are free at s in the schedule of big
orders. This clearly lower bounds the cost of our original problem for the sum of weighted
midpoints objective. Note that the definition of the midpoint of job j in this setting should be
M_j = (1/p_j) ∫_0^∞ I_j(s) v(s) s ds, where I_j is the indicator function of job j in the schedule, i.e.
I_j(s) equals one if j is being processed at instant s, and zero otherwise.
In other words, it is enough for our purposes to find an algorithm for minimizing the sum of
weighted midpoints on a single machine of variable speed, and then turn it into a solution
on our original schedule of big orders, increasing the cost by at most a (1 + O(ε))-factor.
Interestingly, finding the schedule minimizing the sum of weighted midpoints on one variable
speed machine can be achieved by scheduling in nonincreasing weight to processing time ratio
(known as Smith’s rule).
Claim: Let J be a set of jobs with associated processing times pj and weights wj . Consider
that we have a single machine with variable speed v(s) for s ∈ [0, ∞). Then scheduling jobs
in nonincreasing order of wj /pj gives a solution minimizing the sum of weighted midpoints.
To prove the claim, we proceed by contradiction. Let us consider an optimal solution for
which there exist two jobs j and k, such that j is processed right before k and wk/pk > wj/pj.
Let Mj and Mk be the midpoints of these jobs in this schedule. Observe that swapping these two
jobs decreases the cost of the schedule. To see this, let M′j and M′k be the midpoints of jobs j
and k respectively in the schedule where k is processed before j. Noting that wk/pk > wj/pj,
and pjM′j + pkM′k = pjMj + pkMk, the difference in the cost can be evaluated as
$$w_j M'_j + w_k M'_k - w_j M_j - w_k M_k < \frac{w_k}{p_k}\big(p_j M'_j + p_k M'_k\big) - \frac{w_k}{p_k}\big(p_j M_j + p_k M_k\big) = 0,$$
where the inequality holds since p_j(M'_j − M_j) = p_k(M_k − M'_k) > 0 and w_j/p_j < w_k/p_k.
This proves the claim.
Finally, we show that if we Stretch Intervals on the schedule of big orders in the m
machines, we can use Smith’s rule (list scheduling) over small orders, yielding a near-optimal
schedule. Indeed, applying Stretch Intervals to a schedule introduces an extra ε2 (1 + ε)t idle
time in every machine and interval It , as long as no job of a big order completely crosses
it. Clearly this increases the processing capacity in the m machines enough to ensure that
the load corresponding to small orders processed in any interval surpasses that processed in
the same interval but on the single machine with variable speed. This implies that, for every
small order L (seen as a single job j), the starting time S_j in the parallel machine schedule and the completion time
C_j^S in the single machine schedule satisfy S_j ≤ (1 + ε)C_j^S. The result follows.
Remark that the variable-speed single machine scheduling problem defined in the proof
of the last lemma is N P-hard when the objective is to minimize sum of weighted completion
times. This follows by a simple reduction from number partition to the restricted case in
which the speed v(s) ∈ {0, 1}. We do not know whether there exists a PTAS for this problem,
which would also be enough for the purpose of the proof.
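To make the claim in the proof above concrete, the following sketch evaluates the sum of weighted midpoints on a single machine with piecewise-constant speed for a given job order; the representation of the speed profile and the function name are assumptions made for this example. Running it on Smith's order, i.e. jobs sorted by nonincreasing w_j/p_j, and comparing against other permutations illustrates the exchange argument.

```python
def weighted_midpoint_cost(jobs, speed):
    """Sum of w_j * M_j when `jobs` = [(p_j, w_j), ...] run in the given order on
    one machine whose speed is piecewise constant: speed = [(length, v), ...].
    M_j = (1/p_j) * integral of v(s) * s over the window in which j is processed;
    the speed profile is assumed long enough to finish every job."""
    segments, t = [], 0.0
    for length, v in speed:
        segments.append((t, t + length, v))
        t += length
    total, seg, pos = 0.0, 0, 0.0
    for p_j, w_j in jobs:
        remaining, weighted = p_j, 0.0
        while remaining > 1e-12:
            s0, s1, v = segments[seg]
            start = max(pos, s0)
            if v <= 0 or start >= s1:      # no usable capacity left in this segment
                seg += 1
                continue
            use = min(v * (s1 - start), remaining)        # work done here
            end = start + use / v
            weighted += v * (end ** 2 - start ** 2) / 2.0  # integral of v(s)*s ds
            remaining -= use
            pos = end
        total += w_j * weighted / p_j
    return total

# Smith's order: sorted(jobs, key=lambda jw: jw[1] / jw[0], reverse=True)
```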
Lemma 5.8 implies that at each time index t we can order small orders by wL /p(L)
and delay the release date of orders that do not fit inside It . After doing this, at most
ε(1 + ε)t m + ε3 (1 + ε)t m ≤ ε(1 + ε)t+1 m processing time of small orders will be released at
(1 + ε)t . Putting together this fact with Theorem 5.7, we obtain the following result and its
corresponding corollary.
Theorem 5.9. At the end of the algorithm the following holds: for every time index t, there
is at most (f1(ε, K) + ε(1 + ε)) m(1 + ε)^t total processing time released at (1 + ε)^t.
Corollary 5.10. Let g(ε, K) := ⌈log((f1(ε, K) + ε(1 + ε))ε^{−2})⌉ + s + 1. There exists a
(1 + O(ε))-approximate schedule where every order L ∈ O is processed between rL and
rL(1 + ε)^{g(ε,K)}.
Proof. Let us consider any (1 + O(ε))-approximate schedule. Applying Stretch Intervals
generates ε2 (1 + ε)t extra idle space for each interval-machine pair (It , i), and mε2 (1 + ε)t in
total for each interval. For a fixed t, we move to the left orders released at (1 + ε)t that are
completely scheduled after (1 + ε)t+g(ε,K)−s , by using the space corresponding to the interval
starting at (1 + ε)t+g(ε,K)−s created by Stretch Intervals. The rest of the orders must start
processing after (1 + ε)t+g(ε,K)−s , and since they do not cross more than s intervals they will
finish before (1 + ε)t+g(ε,K) . In this way the cost is not increased in more than a factor (1 + ε),
since after applying Stretch Intervals we only move jobs to the left. Also, the structure of
the near-optimal solution given in Lemma 5.1 is preserved because orders that are moved
can be processed consecutively.
To conclude, we must show that we can process all orders released at (1 + ε)^t in the idle space corresponding to the interval starting at (1 + ε)^{t+g(ε,K)−s} created by Stretch Intervals. For that, it is enough to notice that the total processing time released at (1 + ε)^t is smaller than the extra idle space in the interval starting at (1 + ε)^{t+g(ε,K)−s−1}. Indeed,

    mε^2(1 + ε)^{t+g(ε,K)−s−1} = mε^2(1 + ε)^{t+⌈log((f1(ε,K)+ε(1+ε))ε^{-2})⌉} ≥ m(1 + ε)^t (f1(ε, K) + ε(1 + ε)).

Finally, since for every sufficiently small ε every order released at (1 + ε)^t is small w.r.t. (1 + ε)^{t+g(ε,K)−s−1}, we can accommodate all its jobs in the interval starting at (1 + ε)^{t+g(ε,K)−s}. Clearly this can be done simultaneously for every t = u, . . . , v.
5.3
Polynomial Representation of Order’s Subsets
The objective of this section is to find a collection Θℓ of sets of orders that are processed in block Bℓ or before in a near-optimal schedule. To do this we will give a collection Uℓ of sets of orders that are processed in Bℓ+1 or later. Clearly, this also defines the sets in Θℓ by simply taking the complement of each set in Uℓ. Note that every set in Uℓ must contain all orders with release date larger than or equal to (1 + ε)^{(s+1)(ℓ+1)}, so it is enough to decide, among the orders released before (1 + ε)^{(s+1)(ℓ+1)}, which ones are going to be processed after Bℓ+1. Also, by Corollary 5.10 no order finishes more than g(ε, K) intervals after its release date. Then, to guess which orders are going to be processed after (1 + ε)^{(s+1)(ℓ+1)} we only need to consider orders with release dates between (1 + ε)^{(s+1)(ℓ+1)−g(ε,K)} and (1 + ε)^{(s+1)(ℓ+1)}. For the sake of clarity,
we define the sets Θℓ using the following algorithm.
Algorithm: Polynomial Representation of Order's Subset

1. Let rj := (1 + ε)^{⌊log ε pj⌋}, u := log(min_{j∈J} rj), v := ⌈log(∑_{j∈J} pj)⌉, and P := ⌈log(max_{j∈J} pj)⌉ be as in Algorithm: Localization. Define ũ := ⌊u/(s+1)⌋ and ṽ := ⌊v/(s+1)⌋ as lower and upper bounds for the indices of blocks.

2. For all t = u, . . . , v:

   (i) Consider At ⊆ {0, . . . , K}^P, the set of possible types of big orders released at (1 + ε)^t. Note that by Lemma 5.6, |At| = K^{O(log K/ε)} = O(1).

   (ii) For every α ∈ At, consider the set Ot^α defined in Algorithm: Localization and order its elements by non-increasing weight. Define the collection of nested sets of heavier orders in Ot^α as:

       Wt^α := {W ⊆ Ot^α : W contains the k orders with largest wL in Ot^α, for some k = 0, . . . , n}.

   (iii) Consider the set Rt defined in Algorithm: Localization and order its elements by non-increasing order of wL/p(L). Define the collection of nested sets of orders having larger weight to processing time ratio in Rt as:

       Vt := {V ⊆ Rt : V contains the k orders of largest wL/p(L) in Rt, for some k = 0, . . . , n}.

   (iv) Define the collection of possible sets of orders released at (1 + ε)^t, formed by all possible unions of sets in Vt and in the collections Wt^α for α ∈ At, as:

       Nt := { V ∪ ⋃_{α∈At} Wα : V ∈ Vt and Wα ∈ Wt^α }.

3. For all ℓ = ũ, . . . , ṽ, define Uℓ as the collection of all sets of the form

       {L ∈ O : L is released after time (1 + ε)^{(s+1)(ℓ+1)}} ∪ ⋃_{t=0}^{g(ε,K)} Nt,

   where Nt ∈ N_{(s+1)(ℓ+1)−g(ε,K)+t} for t = 0, . . . , g(ε, K).

4. For all ℓ = ũ, . . . , ṽ, let Θℓ be the collection containing the complement of every set in Uℓ.
In Step (2) of the algorithm we construct, for every time index t, a collection Nt that contains the sets of orders released at (1 + ε)^t that could be processed after an arbitrary time instant in a near-optimal schedule. Let us consider only orders released at (1 + ε)^t. As described in Step (2.ii), we first construct the collection of possible sets of big orders (w.r.t. (1 + ε)^t) of a given type α. Since for any two orders of the same type the one with largest weight is scheduled first, the orders of type α that are processed the latest are those with the smallest weight. Then we can restrict ourselves to considering at most n sets for every type of order. Analogously, by Lemma 5.8, small orders (w.r.t. (1 + ε)^t) can be scheduled in non-increasing order of wL/p(L). Finally, in Step (2.iv) we construct Nt as the collection containing all possible combinations of sets formed as the union of one set for every type of big order and one set of small orders. Since we are considering only n sets of orders for each type of big orders and there are only K^{O(log K/ε)} different types, we have |Nt| = n^{K^{O(log K/ε)}}.

In Step (3) we define the collections of sets Uℓ by combining all possible sets in Nt for the indices t corresponding to orders that can be processed in Bℓ+1 or later, i.e. for t = (s + 1)(ℓ + 1) − g(ε, K), . . . , v. Clearly |Uℓ| ≤ |Nt|^{g(ε,K)} = n^{K^{O(log K/ε)}/ε^2}, which is polynomial in n. Finally, in Step (4) we construct Θℓ by taking the complement of every set in Uℓ.
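The following Python sketch is a hypothetical illustration of Step (2): it builds the nested prefix collections (the analogues of Wt^α and Vt) and combines one prefix per big-order type with one prefix of small orders, exactly the kind of unions that form Nt. The data layout and all names are assumptions made for the example, not the thesis notation.

    # Hypothetical sketch of Step (2): build the nested collections per type and
    # combine them into the candidate collection N_t.  Orders are plain dicts; the
    # grouping into types and the big/small classification are assumed done elsewhere.
    from itertools import product


    def nested_prefixes(orders, key):
        """All prefixes of `orders` sorted by non-increasing `key` (nested sets)."""
        ranked = sorted(orders, key=key, reverse=True)
        return [tuple(ranked[:k]) for k in range(len(ranked) + 1)]


    def candidate_sets(big_by_type, small_orders):
        """N_t: every union of one prefix per big-order type and one prefix of small orders."""
        w_collections = [nested_prefixes(orders, key=lambda L: L["w"])
                         for orders in big_by_type.values()]
        v_collection = nested_prefixes(small_orders, key=lambda L: L["w"] / L["p"])
        result = []
        for choice in product(v_collection, *w_collections):
            union = set()
            for part in choice:
                union.update(L["name"] for L in part)
            result.append(frozenset(union))
        return result


    # Made-up instance: two types of big orders and two small orders released together.
    big_by_type = {
        "alpha1": [{"name": "L1", "w": 4.0, "p": 9.0}, {"name": "L2", "w": 2.0, "p": 9.0}],
        "alpha2": [{"name": "L3", "w": 5.0, "p": 7.0}],
    }
    small_orders = [{"name": "L4", "w": 1.0, "p": 0.5}, {"name": "L5", "w": 3.0, "p": 0.4}]
    print(len(candidate_sets(big_by_type, small_orders)))  # 18 candidate sets (3 * 3 * 2)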
5.4
Polynomial Representation of Frontiers
We now prove that, for a given block, it is enough to consider only a polynomial number of outgoing frontiers. This is necessary to apply the described dynamic programming algorithm. Let us consider a fixed block Bℓ.

Recall that an outgoing frontier can be seen as a vector whose i-th component denotes the value of the frontier corresponding to machine i. It is important to observe that we can restrict ourselves to frontiers whose entries belong to

    Γℓ := {(1 + ε)^{(s+1)(ℓ+1)} + kε^3(1 + ε)^{(s+1)(ℓ+1)−1} : k = 0, . . . , ⌈(1 + ε)[(1 + ε)^s + 1]/ε^3⌉},

whenever the frontier corresponds to an outgoing frontier of block Bℓ.
Indeed, notice first that if for some schedule Fℓ−1 and Fℓ are outgoing frontiers of blocks Bℓ−1 and Bℓ respectively, then there is a schedule such that for every machine either the difference between the two frontiers is greater than ε^2(1 + ε)^{(s+1)(ℓ+1)−1}, or the frontier Fℓ coincides with the beginning of the block Bℓ+1. Otherwise, the set of jobs processed between Fℓ−1 and Fℓ in machine i has total processing time smaller than ε^2(1 + ε)^{(s+1)(ℓ+1)−1}, and thus Stretch Intervals will create enough room in I_{(s+1)(ℓ+1)−1} to fit all these jobs. Then Fℓ can be considered to coincide with the beginning of Bℓ+1.

In this latter case it is clear that by taking k = 0 the corresponding entry of the outgoing frontier belongs to Γℓ. On the other hand, in the former case the difference between the frontiers is greater than ε^2(1 + ε)^{(s+1)(ℓ+1)−1}, and by Stretch Completion Times we can generate at least ε^3(1 + ε)^{(s+1)(ℓ+1)−1} total idle time in between the frontiers. By moving all jobs in between the frontiers as much as possible to the left (without modifying the left frontier), all created idle time can be assumed to be next to Fℓ. Then we can move Fℓ to the left in order to bring its corresponding component to an element of Γℓ. Clearly this whole procedure increases the cost by at most a factor 1 + O(ε).
With the previous observations we can restrict ourselves to only look at |Γℓ|^m different outgoing frontiers for each block. However, this is not of polynomial size. To overcome this difficulty, we consider a more concise representation of outgoing frontiers.

Concise description of a frontier: A concise outgoing frontier F̂ℓ of block Bℓ is a vector of |Γℓ| entries, in which the k-th component is the number of machines that have (1 + ε)^{(s+1)(ℓ+1)} + (k − 1)ε^3(1 + ε)^{(s+1)(ℓ+1)−1} as the value of their frontier. Then the set of all possible concise outgoing frontiers is given by F̂ℓ := {0, . . . , m}^{|Γℓ|}.
This description of a concise outgoing frontier is not enough to represent every possible block Bℓ lying in between Fℓ−1 and Fℓ. Nevertheless, by doing some extra enumeration the representation is achieved. Since all machines are equal, a block Bℓ with incoming and outgoing frontiers Fℓ−1 and Fℓ can be fully described in the following way: for each pair of elements t1 ∈ Γℓ−1 and t2 ∈ Γℓ we need

    |{i = 1, . . . , m : (Fℓ−1)i = t1 and (Fℓ)i = t2}|,

i.e., the number of machines available from t1 to t2 in the block. Clearly, for each pair F̂ℓ−1 and F̂ℓ we can enumerate all possible such descriptions coinciding with F̂ℓ−1 and F̂ℓ. Indeed, for each element z ∈ {0, . . . , m}^{|Γℓ−1|×|Γℓ|}, z coincides with F̂ℓ−1 and F̂ℓ if and only if

    ∑_{j∈Γℓ} z_{ij} = (F̂ℓ−1)_i    for all i ∈ Γℓ−1,
    ∑_{i∈Γℓ−1} z_{ij} = (F̂ℓ)_j    for all j ∈ Γℓ.

This requires checking m^{|Γℓ−1|·|Γℓ|} = m^{O(1/ε^8)} possible block descriptions. Also, the number of concise frontiers is bounded by |F̂ℓ| ≤ m^{|Γℓ|} = m^{O(1/ε^4)}.
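As an illustration (with hypothetical names and data), the consistency condition above amounts to checking the row and column sums of the matrix z, which the following Python sketch does.

    # Illustrative sketch: z[i][j] = number of machines whose incoming frontier value
    # is the i-th element of Gamma_{l-1} and whose outgoing value is the j-th element
    # of Gamma_l; z is consistent with the concise frontiers exactly when its row sums
    # match f_prev and its column sums match f_next, as in the displayed equations.
    def consistent(z, f_prev, f_next):
        rows_ok = all(sum(z[i]) == f_prev[i] for i in range(len(f_prev)))
        cols_ok = all(sum(z[i][j] for i in range(len(f_prev))) == f_next[j]
                      for j in range(len(f_next)))
        return rows_ok and cols_ok


    # Toy check with 3 machines and two frontier values on each side.
    z = [[1, 1],
         [0, 1]]
    print(consistent(z, f_prev=[2, 1], f_next=[1, 2]))  # True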
5.5
A PTAS for a specific block
To conclude our algorithm we need to compute the table W(ℓ, F′, F, U) as a subroutine in Algorithm: PTAS-DP. That is, for a given block Bℓ, concise incoming and outgoing frontiers F′ and F, and subset of orders U, we need to find a (1 + ε)-approximate solution of the schedule minimizing the sum of weighted completion times of the orders in U inside Bℓ. Note that it is possible to move the frontiers by a factor (1 + ε) without increasing the cost of a global solution by more than a factor (1 + ε).

In the sequel, we consider orders and jobs as big or small with respect to the beginning of block Bℓ. In other words, a job will be small if its processing time is smaller than ε^3(1 + ε)^{(s+1)ℓ}, and big otherwise. Additionally, an order will be small if its total processing time is smaller than ε^2(1 + ε)^{(s+1)ℓ}, and big otherwise. Following the ideas of the previous sections, we enumerate over schedules of big orders, and apply Lemma 5.8 to greedily assign small orders using Smith's rule.
Algorithm: PTAS for a block

1. Redefine the release dates rL := (1 + ε)^{(s+1)ℓ} for all L ∈ U.

2. Apply Step (2) of Algorithm: Localization, and round the processing time of the new jobs to the next power of (1 + ε).

3. Let Qℓ := {p ∈ R : log(p) ∈ N ∩ [3 log(ε) + (s + 1)ℓ, (s + 1)(ℓ + 2)]} be the set of possible sizes that a job belonging to a big order L ∈ U could have. Also, define the set of possible types of big orders in U as Cℓ ⊆ {0, . . . , K}^P. Additionally set

       Ωℓ := {(1 + ε)^{(s+1)ℓ}(1 + kε^4) : k = 0, . . . , ⌈((1 + ε)^{2(s+1)} − 1)/ε^4⌉} = {ω1, . . . , ω_{|Ωℓ|}}.

   We will see that we can restrict ourselves to schedules where every big job starts processing only at an instant that belongs to Ωℓ.

   As we are rounding processing times and starting times we will require some extra room. Therefore, we redefine Ωℓ := (1 + ε)^5 · Ωℓ and Γℓ := (1 + ε)^5 · Γℓ. Here, multiplying a set by a scalar means that every element of the set gets multiplied by it.
4. Define a single machine configuration as a vector S with |Ωℓ| + 2 entries. For k = 1, . . . , |Ωℓ|, its k-th entry contains a pair (qk, ck) ∈ (Qℓ ∪ {0}) × (Cℓ ∪ {∅}), where qk can be interpreted as the processing time of a job that starts processing at ωk, and ck as the type of the order that contains the job. To represent that no job starts processing at a time instant ωk we set Sk = (0, ∅). The last two entries of S, S_{|Ωℓ|+1} ∈ Γℓ−1 and S_{|Ωℓ|+2} ∈ Γℓ, represent the values of the incoming and outgoing frontier of block Bℓ in that machine.

   It is sufficient to consider vectors S for which there is enough space to schedule jobs of the sizes described in S without overlapping, while respecting the corresponding incoming and outgoing frontiers (a sketch of this validity test is given after the algorithm). In other words, a valid S must satisfy:

   (a) For each k = 1, . . . , |Ωℓ|, if i > k and ωi < ωk + qk then Si = (0, ∅).

   (b) For all k = 1, . . . , |Ωℓ|, if ωk < S_{|Ωℓ|+1} then Sk = (0, ∅).

   (c) For all k = 1, . . . , |Ωℓ|, if ωk + qk > S_{|Ωℓ|+2} then Sk = (0, ∅).

   Thus let 𝒮 ⊆ ((Qℓ ∪ {0}) × (Cℓ ∪ {∅}))^{|Ωℓ|} × Γℓ−1 × Γℓ be the set of valid single machine configurations. Notice that 𝒮 = {S^1, . . . , S^{|𝒮|}} is of constant size.
5. For a given schedule we define its parallel machine configuration as a vector M ∈ {0, . . . , m}^{|𝒮|} whose i-th component denotes the number of machines having S^i as single machine configuration. We only consider vectors M that agree with the concise incoming and outgoing frontiers F and F′. In other words, if sk denotes the k-th element of Γℓ−1, and vk the k-th element of Γℓ, we can restrict ourselves to vectors M satisfying
       ∑_{i : S^i_{|Ωℓ|+1} = sk} Mi = Fk     for all k = 1, . . . , |Γℓ−1|,

       ∑_{i : S^i_{|Ωℓ|+2} = vk} Mi = F′k    for all k = 1, . . . , |Γℓ|.
   We also only consider vectors M in which all big orders are completely processed, i.e., for every p ∈ Qℓ and c ∈ Cℓ,

       |{j ∈ J : pj = p and j ∈ L for some L ∈ U of type c}| = ∑_{i=1}^{|𝒮|} Mi · |{k ∈ {1, . . . , |Ωℓ|} : S^i_k = (p, c)}|.

   Define the set of all such possible parallel machine configurations as ℳ.
6. For every parallel machine configuration M in ℳ do the following.

   (a) For each k = 1, . . . , |𝒮|, associate Mk of the machines with the single machine configuration S^k, arbitrarily. After this process, let us call T(i) the single machine configuration that was associated with machine i. Then, for k = 1, . . . , |Ωℓ| and for each machine i = 1, . . . , m do:

       i. Call (q, c) = T(i)_k the job size and order type given by the single machine configuration associated with machine i at time ωk.

       ii. Choose the order of type c of largest weight that still has a job of size q not yet scheduled, and process such a job at time ωk on machine i.

   The schedule thus constructed is a best possible schedule of big orders that agrees with the parallel machine configuration M.

   (b) Consider each small order as a single job, and schedule the small orders in the available space using list scheduling in non-increasing order of wL/p(L), respecting the incoming and outgoing frontiers defined by the configuration M, i.e., in each machine i schedule jobs only between T(i)_{|Ωℓ|+1} and T(i)_{|Ωℓ|+2}. If this is not possible, consider the cost of the schedule as infinity.

7. Among all the schedules constructed in the last two steps, choose the one with lowest cost.
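The following Python sketch illustrates the validity test of Step (4) on hypothetical data: a configuration is rejected when two prescribed jobs would overlap, or when a job would start before the incoming frontier or finish after the outgoing one. It is only a sketch of conditions (a)-(c), not the thesis implementation.

    # Hypothetical sketch of the validity test in Step (4).  slots[k] = (q_k, type_k),
    # with (0, None) meaning that no job starts at omega[k]; f_in and f_out play the
    # roles of the incoming and outgoing frontier values of the configuration.
    def valid_configuration(omega, slots, f_in, f_out):
        for k, (q, _) in enumerate(slots):
            if q == 0:
                continue
            if omega[k] < f_in or omega[k] + q > f_out:           # conditions (b) and (c)
                return False
            for i in range(k + 1, len(slots)):
                if omega[i] < omega[k] + q and slots[i][0] != 0:  # condition (a): overlap
                    return False
        return True


    # Toy grid and two candidate configurations (made-up numbers).
    omega = [10.0, 11.0, 12.5, 14.0]
    ok  = [(1.0, "a"), (0, None), (1.5, "b"), (0, None)]
    bad = [(2.0, "a"), (1.0, "b"), (0, None), (0, None)]   # second job overlaps the first
    print(valid_configuration(omega, ok, f_in=10.0, f_out=15.0),
          valid_configuration(omega, bad, f_in=10.0, f_out=15.0))  # True False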
As in Algorithm: Localization, Steps (1) and (2) are useful for reducing the number of possible different types of big orders to a constant. In Steps (3) and (4) we classify single machine schedules by defining single machine configurations. For that we define three sets. The first set, Qℓ, contains the possible sizes that a job belonging to a big order can take. As seen in Section 5.2, the grouping in Step (2) ensures that all of these jobs have processing time greater than ε^3(1 + ε)^{(s+1)ℓ}. Also, since all jobs must be processed inside Bℓ, we can assume that pj ≤ (1 + ε)^{(s+1)(ℓ+2)}. This justifies the definition of the set Qℓ. Moreover, it is easy to see that |Qℓ| = 2(s + 1) − 3 log(ε) = O(log(1/ε)). Additionally, this implies that the set Cℓ ⊆ {0, . . . , K}^P of possible types of big orders in U has cardinality K^{O(log(1/ε))} = O(1).
Lemma 5.11. By losing at most a factor (1 + ε) we can assume that in any schedule all jobs that belong to big orders start at an instant contained in Ωℓ.

Proof. Note that after Step (2) all jobs belonging to big orders are big jobs. Since Stretch Completion Times will produce a gap of at least ε^4(1 + ε)^{(s+1)ℓ} before each of these jobs, we can move each of them to the left so that its starting time is (1 + ε)^{(s+1)ℓ} + iε^4(1 + ε)^{(s+1)ℓ} for some i = 0, . . . , ⌈((1 + ε)^{2(s+1)} − 1)/ε^4⌉.
With this it is easy to see that the cardinality of 𝒮 is K^{O(log(1/ε)/ε^6)}/ε^8 = K^{O(log(1/ε)/ε^6)} = O(1). Therefore, by construction, the cardinality of the set ℳ of possible parallel machine configurations is at most (m + 1)^{K^{O(log(1/ε)/ε^6)}} = m^{O(1)}.
In Step (6) we enumerate over all possible parallel machine configurations and construct the schedule of smallest cost. This can easily be done by following the same argument as in previous sections: for a given type of order, the one with the largest weight must be processed first. This justifies the schedule of big orders constructed in Step (6.a). Finally, following Lemma 5.8, in Step (6.b) we schedule small orders greedily using Smith's rule.
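As a small illustration of Step (6.a), the sketch below assigns, at each prescribed slot, the heaviest not-yet-used order of the requested type; the bookkeeping of job sizes within an order is omitted for brevity, and all names and data are hypothetical.

    # Hedged sketch of Step (6.a): slots prescribed by the configurations are served,
    # for each order type, by the remaining order of that type with largest weight.
    def assign_big_orders(slot_requests, orders_by_type):
        """slot_requests: list of (machine, start, size, type);
        orders_by_type[c]: list of (weight, order_name) of type c."""
        pools = {c: sorted(orders, reverse=True) for c, orders in orders_by_type.items()}
        schedule = []
        for machine, start, size, c in sorted(slot_requests, key=lambda r: r[1]):
            w, name = pools[c].pop(0)          # heaviest remaining order of type c
            schedule.append((machine, start, size, name))
        return schedule


    # Toy data: two slots of type "alpha"; the heavier order gets the earlier slot.
    slots = [(0, 10.0, 1.0, "alpha"), (1, 11.0, 1.0, "alpha")]
    orders = {"alpha": [(2.0, "L_light"), (5.0, "L_heavy")]}
    print(assign_big_orders(slots, orders))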
Overall, the rounding of processing times, the rounding of starting times, and the grouping and stretching needed for successfully applying Smith's rule (see Lemma 5.8) require extra space in the block to guarantee that our enumeration and greedy processes actually find a near-optimal solution. This extra room is added at the end of Step (2) and it is no more than a factor (1 + ε)^5. We can conclude that Algorithm: PTAS for a block gives a near-optimal schedule in block Bℓ for the orders in U between the concise frontiers F and F′, with time stretched to the right by a factor (1 + ε)^5.
It is important to remark that the same cost is achieved for any permutation of the machines. This is useful to reconstruct the optimal solution once the table of the dynamic program is filled. Since we only describe the frontiers in a concise manner, we do not know precisely which machine has which value of the frontier. A way to overcome this is to construct the schedule from right to left. First we fix the machine permutation of the outgoing concise frontier of the last block (i.e., we fix a precise outgoing frontier); with this we can compute a specific schedule of the block that complies with such a frontier and with the concise incoming frontier, using the previous PTAS. This, in turn, fixes a specific incoming frontier, which we use as the outgoing frontier of the previous block. We have thus proved the following.
Theorem 5.12. Algorithm: PTAS-DP is a polynomial time approximation scheme for the restricted version of P|part| ∑ wL CL when the number of jobs per order is bounded by a constant K.
Note that, since n > m for any nontrivial instance, a straightforward calculation shows
that the running time of this algorithm is given by
    n^{K^{O(log(K/ε))}/ε^2} · m^{K^{O(log(1/ε)/ε^6)}} = n^{K^{O(log(K/ε))}} · m^{K^{O(log(1/ε)/ε^6)}} = n^{K^{O(log(K/ε)/ε^6)}},
which is polynomial for fixed K and ε.
5.6
Variations
In the last section we showed a PTAS for minimizing the sum of weighted completion times of orders on parallel machines, when the number of jobs per order is a constant. Now we show how to bypass that assumption by instead assuming that the number of machines m is a constant independent of the input. Indeed, we will show that the exact same algorithm gives a PTAS for this case.
Theorem 5.13. Algorithm: PTAS-DP is a PTAS for Pm|part| ∑ wL CL.
Proof. It is sufficient to notice that after applying Step (2) of Algorithm: Localization every order that is small w.r.t. its release date consists of only one job, and that every order that is big w.r.t. its release date contains only jobs bigger than ε^3 rL. Then, since every order finishes within s intervals, no order can have more than m rL(1 + ε)^s/(ε^3 rL) = O(m) = O(1) jobs in it.
The restricted case when the number of orders is constant is considerably simpler, for two reasons. First, the number of possible subsets of orders is also constant, and therefore Steps (1) and (2) of Algorithm: PTAS-DP are not necessary: simply define Θℓ as the power set of O. Also, the number of possible types of orders is constant, and therefore Algorithm: PTAS for a block takes polynomial time. Let us call this modified version Algorithm: PTAS-DP II; then:

Theorem 5.14. Algorithm: PTAS-DP II is a polynomial time approximation scheme for the restricted version of P|part| ∑ wL CL when the number of orders is bounded by a constant C.
A simple, though careful, calculation shows that the running time of Algorithm: PTAS-DP II is O(n) · m^{C^{O(1/ε^6)}}, which is polynomial.
Chapter 6
Concluding remarks and open
problems
In this work we studied the machine scheduling problem of minimizing the sum of weighted completion times of orders under different machine environments. In Chapter 3 we first studied some rounding techniques for the special case of minimizing the makespan on unrelated machines. We showed how a very naive rounding can transform any preemptive schedule into a nonpreemptive one without increasing the makespan by more than a factor of 4. Then, we proved that this rounding method is best possible by exhibiting a family of almost tight instances.
In Chapter 4 we presented approximation algorithms for R|rij| ∑ wL CL and its preemptive version R|rij, pmpt| ∑ wL CL. Both algorithms are based on linear programming relaxations and use, among other things, a rounding technique very similar to the one developed in the previous chapter for the makespan case. Even though these are the only constant factor approximation algorithms known for these problems, the large approximation factors leave several questions open in terms of the approximability of each of them. First, we may ask whether the roundings used in the algorithms can be improved. At first glance, what seems most feasible to improve is the naive trimming of the y values (Step 2 of Algorithm: Greedy Preemptive LP and of Algorithm: Greedy-LP). Although not a proof, we showed in Chapter 3 that truncating the variables in a very similar way is best possible for the special case of minimizing the makespan. This suggests that this step cannot be significantly improved in the more complex algorithms Greedy Preemptive LP and Greedy-LP. To get a more precise conclusion, it would be interesting to find tight instances for the polyhedra used in each of the algorithms. One possible direction for this would be to generalize the family of almost tight instances shown in Section 3.3, although it is not clear how to do this.
Recall that the best known hardness result for R|| ∑ wL CL derives from the fact that it is NP-hard to approximate R||Cmax within a factor better than 3/2. Considering that the algorithm given in this work achieves a performance guarantee of 27/2, it would be interesting to close this gap, or at least diminish it. Given the generality of our model, it seems easier to do this by giving a reduction specifically designed for our problem, improving the 3/2 hardness result for our case.
In Chapter 5 we gave a PTAS for P|part| ∑ wL CL, where either the number of jobs per order, the number of orders, or the number of machines is constant. This generalizes several previously known PTASs, such as the ones for P||Cmax and P|| ∑ wj Cj. Thus, it would be interesting to settle whether the unrestricted case P|part| ∑ wL CL is APX-hard. Also, in this chapter we introduced the problem of minimizing the sum of weighted midpoints of jobs on a variable speed machine, proving that it can be solved in polynomial time by a greedy algorithm. In addition, we briefly discussed the problem of minimizing the sum of weighted completion times on a variable speed machine. This problem, which can be proved to be NP-hard, has no known constant factor approximation algorithm, nor a proof showing that such an algorithm cannot be obtained. Answering this question would be of great interest given the very natural settings where this problem could arise.
Finally, another possible direction for further research is to study the problem of minimizing the sum of weighted completion times of orders in an online setting. In this variant orders arrive over time and no information is known about an order before it has arrived. In online problems we are interested in comparing the cost of our solution to the cost of the optimal solution in the offline setting, where all the information is known from time 0. To this end, the notion of α-points (see for example [18, 5, 35, 10]) has proven useful for the problem of minimizing the sum of weighted completion times of jobs, and thus it would be interesting to study the use of this technique in our more general setting.
Bibliography
[1] F. Afrati, E. Bampis, C. Chekuri, D. Karger, C. Kenyon, S. Khanna, I. Milis, M.
Queyranne, M. Skutella, C. Stein, M. Sviridenko, 1999. “Approximation schemes for
minimizing average weighted completion time with release dates.” Proceedings of the
40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 32–43.
[2] C. Ambühl and M. Mastrolilli, 2006. “Single Machine Precedence Constrained Scheduling is a Vertex Cover Problem”, Proceedings of the 14th Annual European Symposium
on Algorithms (ESA), LNCS 4168, 28–39.
[3] C. Ambühl, M. Mastrolilli, O. Svensson, 2007. “Inapproximability Results for Sparsest
Cut, Optimal Linear Arrangement, and Precedence Constrained Scheduling.” Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS),
329–337.
[4] C. Chekuri, R. Motwani, 1999. “Precedence constrained scheduling to minimize sum of
weighted completion times on a single machine”. Discrete Applied Mathematics, 98:29–
38.
[5] C. Chekuri, R. Motwani, B. Natarajan, C. Stein, 2001. “Approximation techniques for
average completion time scheduling”, SIAM Journal on Computing, 31:146–166.
[6] F. Chudak, D. S. Hochbaum, 1999. “A half-integral linear programming relaxation for
scheduling precedence-constrained jobs on a single machine”. Operations Research Letters, 25:199–204.
[7] Z. Chen and N.G. Hall, 2001. “Supply chain scheduling: assembly systems.” Working
Paper, Department of Systems Engineering, University of Pennsylvania.
[8] Z. Chen and N.G. Hall, 2007. “Supply chain scheduling: conflict and cooperation in
assembly systems.” Operations Research, 55:1072–1089.
[9] R. W. Conway, W. L. Maxwell, and L. W. Miller, 1967. “Theory of Scheduling”, Addison-Wesley, Reading, Mass.
[10] J. R. Correa, M. Wagner, 2005. “LP-Based Online Scheduling: From Single to Parallel
Machines”. Proceedings of the 11th Conference on Integer Programming and Combinatorial Optimization (IPCO), 3509:196–209.
[11] Stephen Cook, 1971. “The complexity of theorem proving procedures”. Proceedings of
the 3rd Annual ACM Symposium on Theory of Computing, 151–158.
[12] J.R. Correa and A.S. Schulz, 2005. “Single Machine Scheduling with Precedence Constraints.” Mathematics of Operations Research, 30:1005–1021.
[13] M.E. Dyer and L. A. Wolsey, 1999. “Formulating the single machine sequencing problem
with release dates as a mixed integer program.” Discrete Applied Mathematics, 26:255–
270.
[14] L. Danzer, B. Grunbaum, and V. Klee, 1963. “Helly’s theorem and its relatives”, in “Convexity”, Proceedings of the Symposium in Pure Mathematics, 7:101–180.
[15] W.L. Eastman, S. Even, I.M. Isaacs, 1964. “Bounds for the optimal scheduling of n jobs
on m processors”, Management Science, 11:268–279.
[16] J. Eckhoff, 1993. “Helly, Radon, and Caratheodory type theorems”. In P. M. Gruber
and J. M. Wills, editors, Handbook of Convex Geometry, 389–448, North-Holland, Amsterdam.
[17] M.R. Garey, D.S. Johnson, 1979. “Computers and Intractability: A Guide to the Theory
of NP-completeness”. Freeman, New York.
[18] M. X. Goemans, 1997. “Improved approximation algorithms for scheduling with release
dates”. Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms,
New Orleans, 591–598.
[19] R. L. Graham, 1966. “Bounds for certain multiprocessing anomalies,” Bell Systems
Technical Journal, 45:1563–1581.
[20] R.L. Graham, E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, 1979. “Optimization
and approximation in deterministic sequencing and scheduling: a survey”. Annals of
Discrete Mathematics, 5:287–326.
[21] L. A. Hall, A. S. Schulz, D. B. Shmoys, J. Wein, 1997. “Scheduling to minimize average completion time: off-line and on-line approximation algorithms”. Mathematics of
Operations Research, 22:513–544.
[22] D. Hochbaum and D. Shmoys, 1987. “Using dual approximation algorithm for scheduling
problems: Theoretical and practical results”, Journal of the ACM, 34:144–162.
[23] H. Hoogeveen, P. Schuurman, G. J. Woeginger, 2001, “Non-approximability results for
scheduling problems with minsum criteria”, INFORMS Journal on Computing, 13:157–
168.
[24] R. M. Karp, 1972. “Reducibility Among Combinatorial Problems”. In R. E. Miller and
J. W. Thatcher, editors, Complexity of Computer Computations, Plenum, New York,
85–103.
[25] E. L. Lawler and J. Labetoulle, 1978. “On Preemptive Scheduling of Unrelated Parallel
Processors by Linear Programming”. Journal of the ACM 25:612–619.
[26] J. K. Lenstra, A. H. G. Rinnooy Kan, 1978. “Complexity of scheduling under precedence constraints”. Operations Research, 26:22–35.
[27] J. K. Lenstra, D. B. Shmoys and É. Tardos, 1990, “Approximation algorithms for scheduling unrelated parallel machines”, Mathematical Programming, 46:259–271.
[28] J. Leung, H. Li, and M. Pinedo, 2006. “Approximation algorithm for minimizing total weighted completion time of orders on identical parallel machines.” Naval Research
Logistics, 53:243–260.
[29] J. Leung, H. Li, M. Pinedo and J. Zhang, 2007. “Minimizing Total Weighted Completion Time when Scheduling Orders in a Flexible Environment with Uniform Machines.”
Information Processing Letters, 103:119–129.
[30] J. Leung, H. Li, and M. Pinedo, 2007. “Scheduling orders for multiple product types to
minimize total weighted completion time.” Discrete Applied Mathematics, 155:945–970.
[31] L. Levin, 1973. “Universal sorting problems”, Problems in Information Transmission,
9:165–266.
[32] F. Margot, M. Queyranne, Y. Wang, 2003. “Decompositions, network flows, and a precedence constrained single machine scheduling problem.” Operations Research, 51:981–992.
[33] M. Queyranne, 1993. “Structure of a simple scheduling polyhedron”, Mathematical Programming, 58:263–285.
[34] A. Schrijver, 2004. “Combinatorial Optimization”, Springer-Verlag, Germany, Volume A.
[35] A. Schulz and M. Skutella, 2002, “Scheduling unrelated machines by randomized rounding”, SIAM Journal on Discrete Mathematics, 15:450–469.
[36] A. S. Schulz and M. Skutella, 1997. “Random-based scheduling: New approximations
and LP lower bounds”. In J. Rolim, editor, Randomization and Approximation Techniques in Computer Science, LNCS 1269, 119–133.
[37] D. B. Shmoys, E. Tardos, 1993. “An approximation algorithm for the generalized assignment problem”. Mathematical Programming, 62:461–474.
[38] M. Skutella, 2001. “Convex quadratic and semidefinite programming relaxations in
scheduling”, Journal of the ACM, 48:206–242.
[39] M. Skutella, 2002. “List Scheduling in Order of α-Points on a Single Machine”. In
E. Bampis, K. Jansen and C. Kenyon, editors, Efficient Approximation and Online
Algorithms, Springer-Verlag, Berlin, 250–291.
[40] M. Skutella and G. J. Woeginger, 2000. “Minimizing the total weighted completion time
on identical parallel machines,” Mathematics of Operations Research, 25:63–75.
[41] W. E. Smith, 1956. “Various optimizers for single-stage production.” Naval Research
Logistics Quarterly, 3:59–66.
[42] V. Vazirani, 2001. “Approximation Algorithms”. Springer-Verlag, New York.
[43] G. J. Woeginger, 2003. “On the approximability of average completion time scheduling
under precedence constraints”. Discrete Applied Mathematics, 131:237–252.
[44] J. Yang and M.E. Posner, 2005. “Scheduling parallel machines for the customer order
problem.” Journal of Scheduling, 8:49–74.