UNIVERSIDAD DE CHILE
FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS
DEPARTAMENTO DE INGENIERÍA MATEMÁTICA

APPROXIMATION ALGORITHMS FOR ORDER SCHEDULING PROBLEMS ON PARALLEL MACHINES

THESIS SUBMITTED FOR THE DEGREE OF INGENIERO CIVIL MATEMÁTICO

JOSÉ CLAUDIO VERSCHAE TANNENBAUM

ADVISOR: JOSÉ RAFAEL CORREA HAEUSSLER
COMMITTEE MEMBERS: MARCOS ABRAHAM KIWI KRAUSKOPF, ROBERTO MARIO COMINETTI COTTI-COMETTI

SANTIAGO DE CHILE
AUGUST 2008

SUMMARY OF THE THESIS SUBMITTED FOR THE DEGREE OF INGENIERO CIVIL MATEMÁTICO
BY: JOSÉ C. VERSCHAE T.
DATE: 06/08/2008
ADVISOR: Mr. JOSÉ R. CORREA

"APPROXIMATION ALGORITHMS FOR ORDER SCHEDULING PROBLEMS ON PARALLEL MACHINES"

The objective of this thesis was to study problems of scheduling orders on machines. In this problem, a manufacturer has a certain number of machines on which a set of jobs must be processed. Each job belongs to an order, corresponding to a request from some client. In addition, each job has an associated processing time, which may depend on the machine on which it is processed, and a release date from which the job can be scheduled. Finally, each order is assigned a weight that reflects how important the order is to the manufacturer. The completion time of an order is the point in time at which all of its jobs have been processed. The manufacturer's problem is to decide when and on which machine each job is processed, with the goal of minimizing the weighted sum of the completion times of the orders.

This model generalizes several classical problems in scheduling. On the one hand, the objective function of our model includes as special cases minimizing the total processing time (makespan) and minimizing the weighted sum of completion times of the jobs.
On the other hand, in this thesis we will see that our model also generalizes the problem of minimizing the weighted sum of job completion times on a single machine subject to precedence constraints.

Since these problems are $\mathcal{NP}$-hard, their apparent intractability suggests looking for efficient algorithms that return a solution whose cost is close to the optimal value. With this goal, and based on time-indexed linear relaxations, a 27/2-approximation algorithm was proposed for the most general version of the problem described above. This is the first algorithm with a constant approximation guarantee for this problem, which improves the result of Leung, Li and Pinedo (2007). Using similar techniques, an algorithm with an approximation guarantee arbitrarily close to 4 was also obtained for the case in which jobs may be preempted.

Furthermore, a polynomial time approximation scheme (PTAS) was found for the case in which the orders are disjoint and a constant number of identical machines is available. Moreover, it was shown that a variant of this approximation scheme can be applied when the number of machines is part of the input of the algorithm, but either the number of jobs per order or the number of orders is constant.

Finally, the problem of minimizing the makespan on unrelated machines was studied. An algorithm was proposed that transforms a solution with preempted jobs into one in which no job is preempted, increasing the makespan by a factor of at most 4. Moreover, it was shown that no algorithm can do the same while increasing the makespan by a smaller factor.

Acknowledgments

First of all, I want to thank my parents and brothers for instilling in me, since childhood, the love of thinking. Their constant support helped me throughout my studies. I thank my brother Rodrigo for always being willing to listen to me and to discuss my writing. To my beloved wife Natalia, whose help, love, patience and unconditional support helped me finish this thesis.

I especially thank my advisor José R. Correa, who with great patience, and through long hours of discussion, introduced me to the world of research. More than only helping me and guiding my work, he gave me friendship, understanding and support in general. Without his constant support this thesis could not have been carried out successfully.

To all the students of the mathematics department of the University of Chile, for always being willing to talk and cheer me up.

I also thank Martin Skutella, who hosted me during my stay in Germany in September and October 2007.
His collaboration and important contributions made Chapter 5 of this thesis possible. I also thank all his group at TU Berlin for making my stay in Berlin a pleasant one. I also thank Nicole Megow for offering me her friendship and support.

Contents

1 Summary in Spanish
  1.1 Machine scheduling problems
  1.2 Approximation algorithms
  1.3 Polynomial time approximation schemes
  1.4 Problem definition
  1.5 Previous work
    1.5.1 Single machine
    1.5.2 Parallel machines
    1.5.3 Unrelated machines
  1.6 Contributions of this work
    1.6.1 Chapter 3: The power of preemption for $R||C_{\max}$
    1.6.2 Chapter 4: Approximation algorithms for minimizing $\sum w_L C_L$ on unrelated machines
    1.6.3 Chapter 5: A PTAS for minimizing $\sum w_L C_L$ on parallel machines
  1.7 Conclusions
2 Introduction
  2.1 Machine scheduling problems
  2.2 Approximation algorithms
  2.3 Polynomial time approximation schemes
  2.4 Problem definition
  2.5 Previous work
    2.5.1 Single machine
    2.5.2 Parallel machines
    2.5.3 Unrelated machines
  2.6 Contributions of this work
3 On the power of preemption on $R||C_{\max}$
  3.1 $R|pmtn|C_{\max}$ is polynomially solvable
  3.2 A new rounding technique for $R||C_{\max}$
  3.3 Power of preemption of $R||C_{\max}$
    3.3.1 Base case
    3.3.2 Iterative procedure
4 Approximation algorithms for minimizing $\sum w_L C_L$ on unrelated machines
  4.1 A $(4 + \varepsilon)$-approximation algorithm for $R|r_{ij}, pmtn|\sum w_L C_L$
  4.2 A constant factor approximation for $R|r_{ij}|\sum w_L C_L$
5 A PTAS for minimizing $\sum w_L C_L$ on parallel machines
  5.1 Algorithm overview
  5.2 Localization
  5.3 Polynomial Representation of Order's Subsets
  5.4 Polynomial Representation of Frontiers
  5.5 A PTAS for a specific block
  5.6 Variations
6 Concluding remarks and open problems

Chapter 1
Summary in Spanish

1.1 Machine scheduling problems

Machine scheduling problems deal with the allocation of scarce resources over time.
They arise in many settings, such as a construction site where the foreman must assign jobs to each worker, a CPU that must process tasks requested by several users, or the production lines of a factory that must process products for its clients.

In general, an instance of a machine scheduling problem consists of a set $J$ of $n$ jobs and a set $M$ of $m$ machines. A solution of the problem, i.e. a schedule, is an assignment that specifies on which machine $i \in M$ and at what point in time each job is processed.

To classify scheduling problems we must look at the different characteristics or attributes of the machines and the jobs, as well as the objective function to be minimized. One of these is the "machine environment", which describes the configuration of the machines in our model. For example, we can consider "identical" or "parallel" machines, where each machine is an identical copy of all the others. In this case each job $j \in J$ takes $p_j$ units of time to process, regardless of the machine to which it is assigned. On the other hand, we can consider a more general situation where each machine $i \in M$ has an associated speed $s_i$, so that the time it takes to process a job $j$ is inversely proportional to the speed of the machine.

Additionally, scheduling problems can be classified according to the characteristics of the jobs. For example, our model could consider "non-preemptive" jobs, i.e. jobs that, once started, cannot be interrupted until they have been completely processed. On the other hand, we could also consider "preemptive" jobs, where we are free to interrupt a job that has started processing and later resume it on the same machine or on a different one.
Finally, we can classify problems according to their objective function. One of the most natural objective functions is to minimize the makespan, defined as the point in time at which the last job finishes processing. More precisely, if for a given schedule we define the "completion time" of a job $j \in J$, denoted $C_j$, as the time at which $j$ finishes processing, then minimizing the makespan corresponds to minimizing $C_{\max} := \max_{j \in J} C_j$. Another classical example of an objective function is to minimize the number of late jobs. In this case, each job $j \in J$ has an associated "deadline" $d_j$, and the goal is to minimize the number of jobs that finish processing after their deadline. Besides these, a large number of other objective functions can be considered.

A large number of scheduling problems can be formed by combining the characteristics just mentioned. It is therefore convenient to introduce a standard notation that describes each of these problems. Graham, Lawler, Lenstra and Rinnooy Kan [20] proposed the "three-field notation", in which a scheduling problem is represented by an expression of the form $\alpha|\beta|\gamma$, where $\alpha$ denotes the machine environment, $\beta$ contains extra constraints or characteristics of the problem, and the last field $\gamma$ denotes the objective function. In what follows we describe some common values of each field $\alpha$, $\beta$ and $\gamma$.

1. Values of $\alpha$.
• $\alpha = 1$: Single machine. We have a single machine to process the jobs. Each job $j \in J$ takes a given time $p_j$ to be processed.
• $\alpha = P$: Parallel machines. We have a number $m$ of identical or parallel machines on which to process the jobs. Hence the processing time of a job $j$ is given by $p_j$, which does not depend on the machine to which $j$ is assigned.
• $\alpha = Q$: Related machines.
In this case each machine $i \in M$ has an associated speed $s_i$. The processing time of job $j \in J$ on machine $i$ is then given by $p_j/s_i$, where $p_j$ is the time it takes to process job $j$ on a machine of speed 1.
• $\alpha = R$: Unrelated machines. In this more general case there is no a priori relation between the processing times of each job on each machine, so the time it takes to process a job $j$ on a machine $i$ is described by an arbitrary number $p_{ij}$.
Additionally, when $\alpha = P$, $Q$ or $R$, we can append the letter $m$ to the field, indicating that the number of available machines is constant. For example, if the parallel machine model considers a fixed number $m$ of machines, we denote it by $\alpha = Pm$. The value of $m$ can also be specified, e.g., $\alpha = P2$ means that two parallel machines are available to process the jobs.

2. Values of $\beta$.
• $\beta = pmtn$: Preemptive jobs. In this case we consider jobs that can be interrupted (once or several times) before being finished, and that must later be completed on the same machine or on a different one.
• $\beta = r_j$: Release dates. Each job has an associated point in time $r_j$ from which it can start being processed.
• $\beta = prec$: Precedence constraints. Consider a partial order $(J, \prec)$ on the jobs. If $j \prec k$ for some pair of jobs $j$ and $k$, then $k$ can only start processing after the completion time of job $j$.

3. Values of $\gamma$.
• $\gamma = C_{\max}$: Makespan. The goal is to minimize the makespan $C_{\max} := \max_{j \in J} C_j$, where $C_j$ is the point in time at which job $j$ finishes processing, i.e. its completion time.
• $\gamma = \sum C_j$: Average completion time. The average completion time must be minimized or, equivalently, the sum of completion times $\sum_{j \in J} C_j$.
• $\gamma = \sum w_j C_j$: Weighted sum of completion times. A weight $w_j$ is associated with each job $j \in J$, describing how important the job is. The goal is then to minimize the weighted sum of completion times, $\sum_{j \in J} w_j C_j$.

It is worth noting that, by convention, jobs are non-preemptive by default. That is, an empty field $\beta$ means that jobs cannot be interrupted until they are completed. For example, $R||\sum w_j C_j$ denotes the problem of finding a schedule of the jobs in $J$ on the set of machines $M$, without preempting any job, where each job $j \in J$ takes $p_{ij}$ units of time to process on machine $i \in M$, and the weighted sum of completion times $\sum w_j C_j$ is minimized. On the other hand, $R|pmtn|\sum w_j C_j$ denotes the same problem, with the difference that preemptions are allowed. Moreover, note that the field $\beta$ can take more than one value. For example, $R|pmtn, r_j|\sum w_j C_j$ is the same as the previous problem, with the additional constraint that each job $j$ can only start processing after time $r_j$.

Among all scheduling problems, a large number are $\mathcal{NP}$-hard, and therefore admit no polynomial time algorithm that solves them to optimality unless $\mathcal{P} = \mathcal{NP}$. In particular, it is easy to show that one of the fundamental problems in the area, $P2||C_{\max}$, is $\mathcal{NP}$-complete. In what follows we describe some general techniques that can be used to tackle $\mathcal{NP}$-hard optimization problems, as well as some of their applications to scheduling.

1.2 Approximation algorithms

The introduction of the class of $\mathcal{NP}$-complete problems by Cook [11], Karp [24] and, independently, Levin [31] left major open challenges on how these problems can be tackled given their apparent intractability.
One option that has been studied in depth is that of algorithms that solve the problem to optimality but have no polynomial upper bound on their running time. Algorithms of this kind can be useful on small or medium-sized instances, or on instances with some special structure for which the running times are achievable in practice. However, there may be other instances on which the algorithm takes exponential time to finish, which restricts its practical usefulness. The most common of these approaches are Branch and Bound, Branch and Cut, and Integer Programming techniques.

For $\mathcal{NP}$-hard optimization problems, another alternative is to use algorithms that run in polynomial time but do not necessarily solve the problem to optimality. Among these, a particularly interesting class is that of "approximation algorithms", where the computed solution is guaranteed to have a cost close to that of the optimum. More precisely, consider a minimization problem $P$ with cost function $c$. For a given $\alpha \ge 1$, we say that a solution $S$ of $P$ is an "$\alpha$-approximation" if its cost $c(S)$ is within a factor $\alpha$ of the optimal cost $OPT$, i.e., if
\[ c(S) \le \alpha \cdot OPT. \tag{1.1} \]
With this, consider an algorithm $A$ whose output on an instance $I$ is given by $A(I)$. We say that $A$ is an "$\alpha$-approximation algorithm" if $A(I)$ is an $\alpha$-approximation for every instance $I$. The number $\alpha$ is called the "approximation factor" of $A$, and if $\alpha$ does not depend on the input of $A$ we say that the algorithm is a constant factor approximation algorithm.

Analogously, if $P$ is a maximization problem with objective function $c$, a solution $S$ is an $\alpha$-approximation, for some $\alpha \le 1$, if
\[ c(S) \ge \alpha \cdot OPT. \]
As before, an algorithm $A$ is an $\alpha$-approximation algorithm if $A(I)$ is an $\alpha$-approximation for every instance $I$ of $P$.
In what follows we only study minimization problems, so this last definition will not be used.

One of the first approximation algorithms was given by R.L. Graham [19] in 1966, even before the notion of $\mathcal{NP}$-completeness had been formally introduced. Graham studied the problem of minimizing the makespan on parallel machines, $P||C_{\max}$, proposing the following greedy algorithm: (1) order the jobs arbitrarily, $(j_1, \ldots, j_n)$; (2) for each $k = 1, \ldots, n$, process job $j_k$ on the machine on which it would finish first. A procedure of this kind is called a list-scheduling algorithm.

Lemma 1.1 (Graham 1966 [19]). List-scheduling is a $(2 - 1/m)$-approximation algorithm for $P||C_{\max}$.

Proof. First note that if $OPT$ denotes the makespan of the optimal solution, then
\[ OPT \ge \frac{1}{m} \sum_{j \in J} p_j, \tag{1.2} \]
since the right-hand side of this inequality is the average processing time over the $m$ machines, which cannot exceed the optimal makespan. Now let $\ell \in \{1, \ldots, n\}$ be such that $C_{j_\ell} = C_{\max}$, and denote by $S_j = C_j - p_j$ the time at which job $j$ starts processing. Since all machines are busy until time $S_{j_\ell}$, processing only jobs among $j_1, \ldots, j_{\ell-1}$, we have
\[ S_{j_\ell} \le \frac{1}{m} \sum_{k=1}^{\ell-1} p_{j_k}, \]
and therefore
\[ C_{\max} = S_{j_\ell} + p_{j_\ell} \le \frac{1}{m} \sum_{k=1}^{\ell} p_{j_k} + \left(1 - \frac{1}{m}\right) p_{j_\ell} \le \left(2 - \frac{1}{m}\right) OPT, \tag{1.3} \]
where the last inequality follows from (1.2) and from the fact that $p_{j_\ell} \le OPT$, since no schedule can finish before time $p_j$ for any $j \in J$.

As the proof shows, a crucial step in the previous analysis is to obtain a "good" lower bound on the optimal solution (for example Equation (1.2) in the previous lemma), and then use it to bound from above the solution given by the algorithm (as in Equation (1.3)). Most techniques for finding lower bounds on the optimum are problem dependent, and it is therefore difficult to give general rules for finding them.
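The list-scheduling procedure of Lemma 1.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `list_schedule`, the use of a heap to find the least loaded machine, and the sample instance are assumptions made here and do not come from the thesis. On identical machines, the machine on which a job would finish first is simply the one with the smallest current load.

```python
import heapq


def list_schedule(processing_times, m):
    """Greedy list scheduling on m identical machines (hypothetical helper).

    Jobs are taken in the given (arbitrary) order and each one is placed on
    the machine that currently finishes earliest.
    Returns (assignment, makespan).
    """
    if m < 1:
        raise ValueError("need at least one machine")
    # Heap of (current load, machine index); ties go to the lowest index.
    machines = [(0, i) for i in range(m)]
    heapq.heapify(machines)
    assignment = []
    makespan = 0
    for p in processing_times:
        load, i = heapq.heappop(machines)  # least loaded machine
        finish = load + p
        assignment.append(i)
        makespan = max(makespan, finish)
        heapq.heappush(machines, (finish, i))
    return assignment, makespan


# Illustrative instance (not from the thesis): six jobs, three machines.
jobs = [2, 3, 4, 6, 2, 2]
m = 3
assignment, makespan = list_schedule(jobs, m)
# Lower bound on OPT: average load as in (1.2), and the longest job.
lower_bound = max(sum(jobs) / m, max(jobs))
print(assignment, makespan, lower_bound)
```

Comparing `makespan` with `(2 - 1/m) * lower_bound` checks the guarantee of Lemma 1.1 on this instance, since the quantity `lower_bound` never exceeds the optimal makespan.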
One of the few general techniques for obtaining lower bounds, which has proven useful in a variety of contexts, is to formulate the problem as an integer linear program and then relax the integrality constraints. Clearly, the optimal value of the relaxed problem is a lower bound on that of the original problem. An algorithm that uses this technique to find lower bounds is commonly called an "LP-based approximation algorithm". To illustrate this idea we consider the following problem.

Minimum Cost Vertex Cover:
Input: A graph $G = (V, E)$ and a cost function $c : V \to \mathbb{Q}$ on the vertices.
Goal: Find a vertex cover, i.e. a set $B \subseteq V$ such that every edge in $E$ is incident to some vertex in $B$, minimizing the cost $c(B) = \sum_{v \in B} c(v)$.

It is easy to see that this problem is equivalent to the following integer program:
\[ \text{[LP]} \quad \min \sum_{v \in V} y_v\, c(v) \tag{1.4} \]
\[ y_v + y_w \ge 1 \quad \text{for all } vw \in E, \tag{1.5} \]
\[ y_v \in \{0, 1\} \quad \text{for all } v \in V. \tag{1.6} \]
Replacing Equation (1.6) by $y_v \ge 0$, we obtain a linear program whose optimal value is a lower bound on that of Minimum Cost Vertex Cover. To obtain a constant factor approximation algorithm, we proceed as follows. First we solve the relaxed [LP] (using, for example, the ellipsoid method), and call the solution $y^*_v$. To round this fractional solution, note that Equation (1.5) implies that for each edge $vw \in E$, either $y^*_v \ge 1/2$ or $y^*_w \ge 1/2$. Hence the set $B = \{v \in V \mid y^*_v \ge 1/2\}$ is a vertex cover, and moreover we can bound its cost as follows:
\[ c(B) = \sum_{v :\, y^*_v \ge 1/2} c(v) \le 2 \sum_{v \in V} y^*_v\, c(v) \le 2\, OPT_{LP} \le 2\, OPT, \tag{1.7} \]
where $OPT$ is the optimal value of Minimum Cost Vertex Cover and $OPT_{LP}$ is the optimum of the relaxed [LP]. Hence the algorithm described is a 2-approximation.

Noting that $OPT \le c(B)$, Equation (1.7) implies that
\[ \frac{OPT}{OPT_{LP}} \le 2 \]
for any instance $I$ of Minimum Cost Vertex Cover.
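The rounding step behind Equation (1.7) can be illustrated with a small Python sketch. It assumes that a feasible fractional solution $y^*$ is already available (here it is written down by hand for a triangle, rather than computed by an LP solver) and that all costs are nonnegative; the function names `round_vertex_cover` and `is_vertex_cover` and the instance are hypothetical.

```python
from fractions import Fraction


def round_vertex_cover(edges, y):
    """Threshold rounding: keep every vertex whose LP value is at least 1/2."""
    # Constraint (1.5): y_v + y_w >= 1 on every edge, so for each edge at
    # least one endpoint has value >= 1/2 and ends up in the cover.
    for v, w in edges:
        if y[v] + y[w] < 1:
            raise ValueError(f"y violates the edge constraint on ({v}, {w})")
    half = Fraction(1, 2)
    return {v for v, value in y.items() if value >= half}


def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(v in cover or w in cover for v, w in edges)


# Illustrative instance (assumption): a triangle with unit costs.
edges = [(0, 1), (1, 2), (0, 2)]
cost = {0: 1, 1: 1, 2: 1}
# Fractional solution supplied by hand: y_v = 1/2 on every vertex.
y_star = {v: Fraction(1, 2) for v in cost}

cover = round_vertex_cover(edges, y_star)
lp_value = sum(y_star[v] * cost[v] for v in cost)
cover_cost = sum(cost[v] for v in cover)
print(sorted(cover), cover_cost, lp_value)
```

Exact rationals are used so that the comparison with the threshold $1/2$ is not affected by floating-point rounding. On this instance the rounded cover costs exactly twice the fractional value, so the bound in (1.7) holds with equality.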
More generally, any $\alpha$-approximation algorithm that uses $OPT_{LP}$ as a lower bound must satisfy
\[ \max_I \frac{OPT}{OPT_{LP}} \le \alpha. \]
The left-hand side of this inequality is called the "integrality gap" of the linear program. Finding a lower bound on the integrality gap of a linear program is a common technique for determining the best factor achievable by an approximation algorithm that uses the linear program as a lower bound. To do this, it suffices to find an instance whose ratio $OPT/OPT_{LP}$ is as large as possible. For example, it is easy to see that the algorithm just described for Minimum Cost Vertex Cover is best possible when using [LP] as a lower bound. Indeed, if we take $G$ to be the complete graph on $n$ vertices with cost function $c \equiv 1$, we obtain $OPT = n - 1$ and $OPT_{LP} = n/2$, so that $OPT/OPT_{LP} \to 2$ as $n \to \infty$.

1.3 Polynomial time approximation schemes

Given an $\mathcal{NP}$-hard optimization problem, it is natural to ask which polynomial time approximation algorithm has the best approximation factor. Clearly, this depends on the problem. On the one hand, there are problems that admit no approximation algorithm unless $\mathcal{P} = \mathcal{NP}$. For example, the traveling salesman problem with binary costs cannot be approximated within any factor. Indeed, if there were an $\alpha$-approximation algorithm for this problem, then we could decide whether or not there is a Hamiltonian circuit of cost zero: if the optimal value of the traveling salesman problem is zero, the approximation algorithm must return zero by (1.1), regardless of the value of $\alpha$; if the optimal value is greater than zero, so is the solution returned by the algorithm.

On the other hand, there are problems that admit algorithms with arbitrarily good approximation factors.
To formalize this idea we define a "polynomial time approximation scheme" (PTAS) as a family of algorithms $\{A_\varepsilon\}_{\varepsilon > 0}$ such that each $A_\varepsilon$ is a $(1 + \varepsilon)$-approximation algorithm that runs in polynomial time. It is worth noting that $\varepsilon$ is not considered part of the input, and therefore the running time of the algorithm may depend exponentially on $1/\varepsilon$. A PTAS in which the running time of each $A_\varepsilon$ depends polynomially on $1/\varepsilon$ is called a "fully polynomial time approximation scheme" (FPTAS).

1.4 Problem definition

In this thesis we study a scheduling problem that arises naturally in industry. Consider a situation where clients place orders, each consisting of several products, with a manufacturer who has a set of machines available to process them. Each product must be processed on one of the $m$ available machines, and the time it takes to process each job may depend on the machine on which it is scheduled. The manufacturer must decide on a schedule with the goal of giving the best possible service to its clients.

In its most general version, the problem is as follows. We are given a set of jobs $J$ and a set of orders $O \subseteq \mathcal{P}(J)$ such that $\bigcup_{L \in O} L = J$. Each job $j \in J$ has an associated value $p_{ij}$ that represents its processing time on machine $i \in M$. Moreover, each order $L \in O$ has an associated weight $w_L$, depending on how important the order is to the manufacturer. In addition, each job $j$ has a release date $r_{ij}$ that depends on the machine on which the job is processed, so that $j$ cannot start processing on $i$ before $r_{ij}$. An order is completed when all the jobs belonging to it have been completely processed.
Thus, if $C_j$ denotes the point in time at which job $j$ finishes processing, then $C_L = \max\{C_j : j \in L\}$ is the completion time of order $L \in O$. The goal of the manufacturer is to find a (non-preemptive) schedule of the jobs on the $m$ machines that minimizes the "weighted sum of completion times of the orders", i.e.,
\[ \min \sum_{L \in O} w_L C_L. \]
It is worth noting that in the setting just described a job may belong to more than one order simultaneously, i.e., the orders in $O$ need not be pairwise disjoint.

To adopt the three-field notation of Graham et al., we denote this problem by $R|r_{ij}|\sum w_L C_L$, or $R||\sum w_L C_L$ in the case where all release dates are 0. When the processing times $p_{ij}$ do not depend on the machine, we replace the "$R$" by a "$P$", indicating that we are in the case of parallel machines. Moreover, when we impose the extra condition that the orders form a partition of $J$, we add $part$ to the second field $\beta$.

As we will show later, our problem generalizes several classical scheduling problems. These include $R||C_{\max}$, $R|r_{ij}|\sum w_j C_j$ and $1|prec|\sum w_j C_j$. Since all of these problems are strongly $\mathcal{NP}$-hard (see for example [17]), so is our more general problem.

Surprisingly, the best known approximation factor for each of these problems is 2 [4, 35, 37]. However, for our more general setting, no constant factor approximation algorithm is known. The best result, given by Leung, Li, Pinedo and Zhang [29], is an algorithm for the special case of related machines (i.e. $p_{ij} = p_j/s_i$, where $s_i$ denotes the speed of machine $i$) in which all release dates are zero. The approximation factor of this algorithm is $1 + \rho(m - 1)/(\rho + m - 1)$, where $\rho$ is the ratio between the speeds of the fastest and the slowest machine.
In general this bound is not constant, and it can be as bad as $m/2$.

1.5 Previous work

To illustrate the flexibility of our model, we present several models that our problem generalizes, together with an overview of the most important results known about them.

1.5.1 Single machine

We begin by considering the problem of minimizing the weighted sum of completion times of orders on a single machine. We first study the case where no job belongs to more than one order, $1|part|\sum w_L C_L$, showing that it is equivalent to $1||\sum w_j C_j$. Smith [41] showed that the latter problem can be solved in polynomial time using a list-scheduling algorithm that orders the jobs by non-increasing $w_j/p_j$. This greedy algorithm is known as "Smith's rule".

To see that the two problems are indeed equivalent, we first show that there is an optimal solution of $1|part|\sum w_L C_L$ in which all the jobs of each order $L \in O$ are processed consecutively. To see this, consider an optimal solution where this does not hold. Then there exist jobs $j, \ell \in L$ and $k \in L' \ne L$ such that $k$ starts processing at time $C_j$ and $\ell$ is processed after $k$. Swapping jobs $j$ and $k$, i.e. moving $k$ forward by $p_j$ units of time and delaying $j$ by $p_k$ units of time, does not increase the cost of the solution. Indeed, job $k$ decreases its completion time, and therefore $C_{L'}$ does not increase. The completion time of order $L$ does not increase either, since job $\ell \in L$, which is still processed after $j$, remains unchanged. Iterating this argument we end up with a schedule in which all the jobs of each order are processed consecutively. Therefore, each order can be viewed as a larger job with processing time $\sum_{j \in L} p_j$, and hence $1|part|\sum w_L C_L$ reduces to $1||\sum w_j C_j$.
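The reduction above, followed by Smith's rule, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: orders are pairwise disjoint, weights and processing times are positive integers, and the names `weighted_completion` and `smith_rule_for_orders`, as well as the sample data, are made up for illustration.

```python
from fractions import Fraction


def weighted_completion(orders, sequence):
    """Cost sum of w_L * C_L when whole orders are processed in `sequence`.

    `orders` maps an order name to (weight, list of job processing times).
    """
    t = 0
    total = 0
    for name in sequence:
        weight, times = orders[name]
        t += sum(times)       # the order completes when its last job does
        total += weight * t
    return total


def smith_rule_for_orders(orders):
    """Merge each order into one job, then sort by non-increasing w_L / p_L."""
    def ratio(name):
        weight, times = orders[name]
        return Fraction(weight, sum(times))  # exact ratio, no float rounding

    sequence = sorted(orders, key=ratio, reverse=True)
    return sequence, weighted_completion(orders, sequence)


# Illustrative instance (assumption): three disjoint orders on one machine.
orders = {
    "A": (3, [1, 2]),  # w = 3, total processing time 3, ratio 1
    "B": (1, [4]),     # w = 1, total processing time 4, ratio 1/4
    "C": (4, [1, 1]),  # w = 4, total processing time 2, ratio 2
}
sequence, cost = smith_rule_for_orders(orders)
print(sequence, cost)
```

On this instance the rule processes C, then A, then B, with completion times 2, 5 and 9 and total cost $4 \cdot 2 + 3 \cdot 5 + 1 \cdot 9 = 32$. Enumerating the six possible sequences of orders with `weighted_completion` confirms that none is cheaper.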
We now consider the more general problem $1||\sum w_L C_L$, where a job is allowed to belong to more than one order simultaneously. It can be shown that this problem is equivalent to $1|prec|\sum w_j C_j$ (see Section 2.5.1), which we summarize in the following theorem.

Theorem 1.2. There is an $\alpha$-approximation algorithm for $1|prec|\sum w_j C_j$ if and only if there is one for $1||\sum w_L C_L$.

The problem with precedence constraints $1|prec|\sum w_j C_j$ has received much attention since the sixties. Lenstra and Rinnooy Kan [26] proved that this problem is strongly $\mathcal{NP}$-hard, even if the weights or the processing times are all equal to one. On the other hand, several 2-approximation algorithms have been proposed: Hall, Schulz, Shmoys and Wein [21] gave an algorithm based on a linear relaxation, while Chudak and Hochbaum [6] proposed another 2-approximation algorithm based on a half-integral linear relaxation. Moreover, Chekuri and Motwani [4], and Margot, Queyranne and Wang [32] independently developed a simple combinatorial algorithm with approximation factor 2. Furthermore, the results in [2, 12] imply that $1|prec|\sum w_j C_j$ is a special case of vertex cover. On the other hand, nothing was known about the hardness of approximating this problem until recently, when Ambühl, Mastrolilli and Svensson [3] showed that it admits no PTAS unless $\mathcal{NP}$-hard problems can be solved in randomized subexponential time.

1.5.2 Parallel machines

In this section we discuss scheduling problems on parallel machines, where the processing time of each job $j$ is given by $p_{ij} = p_j$ and does not depend on the machine on which $j$ is processed.

Recall the previously defined problem of minimizing the makespan on parallel machines, $P||C_{\max}$, which consists of finding a schedule of a set of jobs $J$ on a set $M$ of $m$ parallel machines that minimizes the maximum completion time.
Note that if in our problem $P||\sum w_L C_L$ the set $O$ contains only one order, the objective function becomes $\max_{j \in J} C_j = C_{\max}$, and therefore $P||C_{\max}$ is a special case of $P||\sum w_L C_L$, which in turn is a special case of our more general model $R|r_{ij}|\sum w_L C_L$.

$P||C_{\max}$ is a classical scheduling problem. It is easily shown to be NP-hard, even for $m = 2$, since the 2-partition problem reduces to it. On the other hand, as we showed in Lemma 1.1, list scheduling is a 2-approximation algorithm. Moreover, Hochbaum and Shmoys [22] presented a PTAS for this problem (see also [42, Chapter 10]).

When each order in our model contains a single job, the problem becomes equivalent to minimizing the weighted sum of completion times $\sum_{j \in J} w_j C_j$. Hence our problem also generalizes $P||\sum w_j C_j$. The study of this problem also goes back to the sixties (see for example [9]). As with $P||C_{\max}$, the problem is NP-hard even when there are only two machines. A sequence of approximation algorithms was proposed until Skutella and Woeginger [40] found a PTAS for this problem. Later, Afrati et al. [1] extended this result to the case with nontrivial release dates.

This raises the question of whether there is a PTAS for $P|part|\sum w_L C_L$ (recall that, as discussed in Section 1.5.1, the slightly more general problem $P||\sum w_L C_L$ has no PTAS unless NP-hard problems can be solved in randomized subexponential time). Although we do not know whether this is the case, Leung, Li and Pinedo [28] (see also Yang and Posner [44]) presented a 2-approximation algorithm for this problem, which is the best known so far.
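The list-scheduling algorithm for $P||C_{\max}$ mentioned in this section can be sketched as follows. This is only an illustrative Python sketch under the assumption that jobs are given as a list of processing times and that each job, in list order, goes to the machine that currently has the smallest load; the function name `list_schedule` is hypothetical.

```python
import heapq


def list_schedule(processing_times, m):
    """Greedy list scheduling on m identical machines.

    Each job, taken in list order, is assigned to the machine with the
    smallest current load. Returns (makespan, assignment), where
    assignment[k] is the machine index chosen for the k-th job.
    """
    heap = [(0, i) for i in range(m)]   # (current load, machine index)
    assignment = []
    for p in processing_times:
        load, machine = heapq.heappop(heap)
        assignment.append(machine)
        heapq.heappush(heap, (load + p, machine))
    makespan = max(load for load, _ in heap)
    return makespan, assignment
```

Both the average load $\sum_j p_j / m$ and the largest processing time $\max_j p_j$ are lower bounds on the optimal makespan, so comparing the returned makespan with twice the larger of the two is a simple way to observe the factor-2 guarantee on small examples.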
1.5.3 Unrelated machines

In the more general case of unrelated machines, our problem also generalizes several classical scheduling problems. As before, if there is a single order and $r_{ij} = 0$, our problem becomes equivalent to $R||C_{\max}$. Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for $R||C_{\max}$, and showed that no algorithm with a guarantee better than 3/2 can be obtained unless P = NP. Hence the same hardness result holds for our more general problem $R||\sum w_L C_L$.

On the other hand, if the orders are singletons and the release dates are trivial, i.e., $r_{ij} = 0$, our problem becomes $R||\sum w_j C_j$. As in the makespan case, this problem is APX-hard [23] and therefore does not admit a PTAS unless P = NP. However, Schulz and Skutella [35] used a linear relaxation to design a $(3/2 + \varepsilon)$-approximation algorithm for the case where all release dates are zero, and a $(2 + \varepsilon)$-approximation algorithm when the release dates are nontrivial. Moreover, Skutella [38] refined this result using convex quadratic programming, obtaining a 3/2-approximation algorithm when $r_{ij} = 0$ and a 2-approximation algorithm with arbitrary release dates.

Finally, it is worth mentioning that our problem also generalizes the assembly line problem, $A||\sum w_j C_j$, which has received considerable attention recently (see e.g. [7, 8, 30]). An instance of this problem consists of a set $M$ of $m$ machines and a set of jobs $J$ with associated weights $w_j$. Each job $j \in J$ consists of $m$ parts, such that the $i$-th part of $j$ must be processed on the $i$-th machine, where it takes $p_{ij}$ time units. The objective is to minimize the weighted sum of completion times, $\sum w_j C_j$, where the completion time of a job in this context is defined as the time at which the last of its parts finishes processing.
To see that our problem generalizes the assembly line problem, it suffices to identify each job of the assembly line with an order of our problem, and its parts with the jobs belonging to that order. To ensure that the jobs of each order can only be processed on their respective machine, we assign them an infinite (or sufficiently large) processing time on all other machines. Besides proving that the assembly line problem is NP-hard, Chen and Hall [7] and Leung, Li and Pinedo [30] independently gave a 2-approximation algorithm using a linear program based on the so-called parallel inequalities (see also [33]).

1.6 Contributions of this work

In this thesis we develop approximation algorithms for $R|r_{ij}|\sum w_L C_L$ and some of its special cases. Below we summarize each chapter.

1.6.1 Chapter 3: The power of preemption for $R||C_{\max}$

In this chapter we study the problem of minimizing the makespan on unrelated machines, $R||C_{\max}$, which is a special case of our more general problem $R||\sum w_L C_L$. The techniques developed in this chapter lead to approximation algorithms for the more general cases $R|r_{ij}|\sum w_L C_L$ and $R|r_{ij}, pmtn|\sum w_L C_L$.

We first review the result of Lawler and Labetoulle [25], which shows that $R|pmtn|C_{\max}$ can be solved in polynomial time. This result rests on a one-to-one correspondence between preemptive schedules and solutions of a linear program. This linear program, which we call [LL] and describe below, uses assignment variables $x_{ij}$ denoting the fraction of job $j$ that is processed on machine $i \in M$.
[LL]
\begin{align*}
\min\ & C \\
\text{s.t.}\ & \sum_{i \in M} x_{ij} = 1 && \text{for all } j \in J, \tag{1.8} \\
& \sum_{j \in J} p_{ij} x_{ij} \le C && \text{for all } i \in M, \tag{1.9} \\
& \sum_{i \in M} p_{ij} x_{ij} \le C && \text{for all } j \in J, \tag{1.10} \\
& x_{ij} \ge 0 && \text{for all } i, j. \tag{1.11}
\end{align*}

It is clear that every preemptive schedule induces a feasible solution of [LL]. Indeed, given a preemptive schedule, let $C$ be its makespan and $x_{ij}$ the fraction of job $j$ that is processed on machine $i$. Then Equation (1.8) must hold since each job is completely processed. Moreover, Equation (1.9) holds since no machine $i \in M$ can finish processing its jobs before time $\sum_j p_{ij} x_{ij}$. Similarly, Equation (1.10) is valid because no job $j$ can be processed on two machines simultaneously, and hence the left-hand side of this equation is a lower bound on the completion time of job $j$. The converse, namely that every solution of [LL] induces a preemptive schedule with makespan $C$, requires more work and uses matching techniques in bipartite graphs.

Next we show how to round any solution of [LL] to a nonpreemptive schedule, increasing the makespan by a factor of at most 4. Together with the fact that [LL] gives a lower bound for $R||C_{\max}$, this yields a 4-approximation algorithm for $R||C_{\max}$. Although this does not improve on the 2-approximation of Lenstra, Shmoys and Tardos [27] for this problem, it has the advantage that the rounding technique is easy to generalize to our general problem $R|r_{ij}|\sum w_L C_L$. Given a solution $x$ and $C$ of [LL], the rounding consists of the following steps.

1. We start by setting to zero the variables that process a job on a machine where it takes too long. More precisely, we define
\[
y_{ij} = \begin{cases} 0 & \text{if } p_{ij} > 2C, \\ x_{ij} & \text{otherwise.} \end{cases}
\]
With this, no job is partially assigned to a machine where it would take more than $2C$ time units to be processed. However, jobs are no longer completely processed in the fractional solution $y$. To fix this we rescale the variables so that the new solution, $x'$, satisfies (1.8). Thanks to Equation (1.10), it is easy to see that no variable more than doubles in this step: by (1.10), less than half of each job can be assigned to machines with $p_{ij} > 2C$, so $\sum_{i \in M} y_{ij} > 1/2$.

2. Finally, we apply to the solution $x'$ a well-known result of Shmoys and Tardos [37], summarized in the following theorem.

Theorem 1.3 (Shmoys and Tardos [37]). Given a nonnegative fractional solution of the system
\begin{align*}
& \sum_{j \in J} \sum_{i \in M} c_{ij} x_{ij} \le C, \tag{1.12} \\
& \sum_{i \in M} x_{ij} = 1 && \text{for all } j \in J, \tag{1.13}
\end{align*}
there exists an integral solution $\hat{x}_{ij} \in \{0, 1\}$ that satisfies (1.12), (1.13), and moreover
\begin{align*}
& x_{ij} = 0 \implies \hat{x}_{ij} = 0 && \text{for all } i \in M,\ j \in J, \tag{1.14} \\
& \sum_{j \in J} p_{ij} \hat{x}_{ij} \le \sum_{j \in J} p_{ij} x_{ij} + \max\{p_{ij} : j \in J,\ x_{ij} > 0\} && \text{for all } i \in M.
\end{align*}
Furthermore, such an integral solution can be found in polynomial time.

It is easy to prove that the algorithm just described ends with a nonpreemptive schedule with makespan at most $4C$, where $C$ is the makespan of the fractional solution given by $x$.

We end Chapter 3 by showing that the integrality gap of [LL] is exactly 4, which implies that the rounding technique just described is best possible. To this end we construct a family of instances of $R||C_{\max}$, $\{I_\beta\}_{\beta < 4}$, such that if $C_\beta^{INT}$ denotes the optimal makespan with nonpreemptive jobs and $C_\beta$ denotes the optimal value of [LL], then $C_\beta^{INT} / C_\beta \ge \beta$ for all $\beta < 4$.

1.6.2 Chapter 4: Approximation algorithms for minimizing $\sum w_L C_L$ on unrelated machines

In this chapter we present approximation algorithms for the general case $R|r_{ij}|\sum w_L C_L$, as well as for its preemptive version, $R|r_{ij}, pmtn|\sum w_L C_L$.
Most of the techniques presented in this chapter generalize the methods shown in Chapter 3.

We first give a $(4 + \varepsilon)$-approximation algorithm for $R|r_{ij}, pmtn|\sum w_L C_L$. To this end we consider a time-indexed linear program whose variables represent the fraction of each job that is processed at each (discrete) point in time on each machine. This kind of linear relaxation was originally introduced by Dyer and Wolsey [13] for the problem $1|r_j|\sum_j w_j C_j$, and was later extended by Schulz and Skutella [35], who used it to obtain $(3/2 + \varepsilon)$-approximation and $(2 + \varepsilon)$-approximation algorithms for $R||\sum w_j C_j$ and $R|r_{ij}|\sum w_j C_j$ respectively.

The linear relaxation considers a time horizon $T$, large enough to be an upper bound on the makespan of any reasonable schedule, for example $T = \max_{i \in M, k \in J}\{r_{ik} + \sum_{j \in J} p_{ij}\}$. We then divide the time horizon into exponentially growing intervals, so that there are only polynomially many, $O(\log T)$, of them. Let $\varepsilon$ be a fixed parameter, and let $q$ be the first integer such that $(1 + \varepsilon)^{q-1} \ge T$. We consider the intervals
\[
[0, 1],\ (1, (1 + \varepsilon)],\ ((1 + \varepsilon), (1 + \varepsilon)^2],\ \ldots,\ ((1 + \varepsilon)^{q-2}, (1 + \varepsilon)^{q-1}].
\]
To simplify notation, we define $\tau_0 = 0$ and $\tau_\ell = (1 + \varepsilon)^{\ell - 1}$ for each $\ell = 1, \ldots, q$. With this, the $\ell$-th interval corresponds to $(\tau_{\ell-1}, \tau_\ell]$.

Given a preemptive schedule, let $y_{ji\ell}$ be the fraction of job $j$ that is processed on machine $i$ in the $\ell$-th interval. Then $p_{ij} y_{ji\ell}$ is the amount of time that job $j$ uses on machine $i$ in the $\ell$-th interval. Consider the following linear program.

[DW]
\begin{align*}
\min\ & \sum_{L \in O} w_L C_L \\
\text{s.t.}\ & \sum_{i \in M} \sum_{\ell = 1}^{q} y_{ji\ell} = 1 && \text{for all } j \in J, \tag{1.15} \\
& \sum_{j \in J} p_{ij} y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} && \text{for all } \ell = 1, \ldots, q \text{ and } i \in M, \tag{1.16} \\
& \sum_{i \in M} p_{ij} y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} && \text{for all } \ell = 1, \ldots, q \text{ and } j \in J, \tag{1.17} \\
& \sum_{i \in M} \Big( y_{ji1} + \sum_{\ell = 2}^{q} \tau_{\ell-1} y_{ji\ell} \Big) \le C_L && \text{for all } L \in O,\ j \in L, \tag{1.18} \\
& y_{ji\ell} = 0 && \text{for all } j, i, \ell \text{ such that } r_{ij} > \tau_\ell, \tag{1.19} \\
& y_{ji\ell} \ge 0 && \text{for all } i, j, \ell. \tag{1.20}
\end{align*}

It is easy to verify that this linear program gives a lower bound for our problem $R|r_{ij}, pmtn|\sum w_L C_L$. Indeed, Equation (1.15) ensures that each job is completely processed. Equation (1.16) must also hold since in each interval $\ell$ and machine $i$ the total amount of available time is at most $\tau_\ell - \tau_{\ell-1}$. Similarly, Equation (1.17) holds because no job can be processed on two machines simultaneously, and therefore in each interval $\ell$ the total amount of time that can be used to process a job is at most the length of the interval. To see that Equation (1.18) is valid, note that $p_{ij} \ge 1$, and therefore $C_L \ge 1$ for all $L \in O$. Note also that $C_L \ge \tau_{\ell-1}$ for all $L$, $j \in L$, $i$, $\ell$ such that $y_{ji\ell} > 0$. Hence the left-hand side of Equation (1.18) is a convex combination of values no larger than $C_L$. Finally, Equation (1.19) is valid since no part of a job can be assigned to an interval that ends before its release date, whatever the machine.

To obtain a $(4 + \varepsilon)$-approximation, we first solve [DW], obtaining a solution $y^*$, $\{C_L^*\}_{L \in O}$. Then we proceed analogously to the rounding of Chapter 3, setting to zero all variables that assign a job $j \in L$ to an interval that starts after $2C_L^*$. Afterwards, we rescale the variables to ensure that all jobs are completely processed. It is easy to see that each variable at most doubles in this step.
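The filtering and rescaling step just described can be sketched in Python for a single job. This is a hedged illustration only: the helper names (`geometric_breakpoints`, `lhs_118`, `filter_and_rescale`), the dictionary layout of the fractional solution, and the use of exact fractions are assumptions made for this example, not the thesis's implementation.

```python
from fractions import Fraction


def geometric_breakpoints(eps, q):
    """Breakpoints tau_0 = 0 and tau_l = (1 + eps)^(l - 1) for l = 1, ..., q."""
    return [Fraction(0)] + [(1 + eps) ** (l - 1) for l in range(1, q + 1)]


def lhs_118(y_j, tau):
    """Left-hand side of constraint (1.18) for one job.

    y_j maps (machine i, interval l) to the fraction y_{jil} of the job.
    Interval 1 gets coefficient 1, interval l >= 2 gets coefficient tau_{l-1}.
    """
    return sum(v * (1 if l == 1 else tau[l - 1]) for (_, l), v in y_j.items())


def filter_and_rescale(y_j, C_L, tau):
    """Drop intervals that start after 2 * C_L, then rescale to total mass 1.

    Interval l is (tau_{l-1}, tau_l], so it starts at tau[l - 1].
    Assumes y_j comes from a feasible solution, so the kept mass is positive.
    """
    kept = {key: v for key, v in y_j.items() if tau[key[1] - 1] <= 2 * C_L}
    mass = sum(kept.values())
    return {key: v / mass for key, v in kept.items()}
```

Under constraint (1.18), less than half of the job can sit in intervals starting after $2C_L$, so the kept mass exceeds $1/2$ and each surviving variable grows by a factor smaller than 2, which is the property the rounding relies on.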
Finally, Equations (1.16) and (1.17) allow us to apply the technique of Lawler and Labetoulle [25] on each time interval $[\tau_{\ell-1}, \tau_\ell)$ to ensure that no job is processed on two machines at the same time. This yields a schedule in which no job $j \in L$ is processed after time $4(1 + \varepsilon) C_L^*$. With this, we can prove the following theorem.

Theorem 1.4. For every $\varepsilon > 0$, there exists a $(4 + \varepsilon)$-approximation algorithm for $R|r_{ij}, pmtn|\sum w_L C_L$.

Next we propose the first constant-factor approximation algorithm for the nonpreemptive case $R|r_{ij}|\sum w_L C_L$. Our algorithm is based on a time-indexed linear program proposed by Hall, Schulz, Shmoys and Wein [21], with variables indicating on which machine and in which interval each job finishes, together with the rounding technique developed in Chapter 3.

As before, we consider a time horizon $T$ larger than the makespan of any reasonable schedule, for example $T = \max_{i \in M, k \in J}\{r_{ik} + \sum_{j \in J} p_{ij}\}$. We also divide the time horizon into intervals that grow exponentially by a factor 3/2,
\[
[1, 1],\ (1, 3/2],\ (3/2, (3/2)^2],\ \ldots,\ ((3/2)^{q-2}, (3/2)^{q-1}].
\]
To simplify notation, we define $\tau_0 = 1$ and $\tau_\ell = (3/2)^{\ell - 1}$ for all $\ell = 1, \ldots, q$. With this, the $\ell$-th interval corresponds to $(\tau_{\ell-1}, \tau_\ell]$.

Given a schedule, we define the variable $y_{ji\ell}$ to be one if and only if job $j$ finishes processing on machine $i$ in the $\ell$-th interval. With this in mind we consider the following linear program.

[HSSW]
\begin{align*}
\min\ & \sum_{L \in O} w_L C_L \\
\text{s.t.}\ & \sum_{i \in M} \sum_{\ell = 1}^{q} y_{ji\ell} = 1 && \text{for all } j \in J, \tag{1.21} \\
& \sum_{s = 1}^{\ell} \sum_{j \in J} p_{ij} y_{jis} \le \tau_\ell && \text{for all } i \in M \text{ and } \ell = 1, \ldots, q, \tag{1.22} \\
& \sum_{i \in M} \sum_{\ell = 1}^{q} \tau_{\ell-1} y_{ji\ell} \le C_L && \text{for all } L \in O,\ j \in L, \tag{1.23} \\
& y_{ji\ell} = 0 && \text{for all } i, \ell, j \text{ such that } p_{ij} + r_{ij} > \tau_\ell, \tag{1.24} \\
& y_{ji\ell} \ge 0 && \text{for all } i, \ell, j. \tag{1.25}
\end{align*}

To see that [HSSW] is a relaxation of our problem, consider an arbitrary nonpreemptive schedule, and define $y_{ji\ell} = 1$ if and only if job $j$ finishes processing on machine $i$ in the $\ell$-th interval. Then Equation (1.21) is valid since each job finishes in exactly one interval and on exactly one machine. The left-hand side of (1.22) corresponds to the total load processed on machine $i$ in the interval $[0, \tau_\ell]$, and therefore the inequality holds. The double sum in inequality (1.23) equals $\tau_{\ell-1}$, where $\ell$ is the interval in which job $j$ completes, so it is at most $C_j$, and hence it is bounded above by $C_L$ if $j \in L$. Rule (1.24) says that some variables must be fixed to zero before solving the LP. This is valid since if $p_{ij} + r_{ij} > \tau_\ell$ then job $j$ cannot finish processing by time $\tau_\ell$ on machine $i$, and therefore $y_{ji\ell}$ must be zero.

To obtain an approximate solution of our problem, we first solve [HSSW] to optimality, calling the solution $y^*$, $\{C_L^*\}_{L \in O}$. Then we set to zero all variables that assign a job $j \in L$ to an interval later than $(3/2) C_L^*$, and rescale so that the jobs are completely assigned by the fractional solution. (The number 3/2 is chosen so that the final algorithm gives the best possible approximation factor.) It can be shown that no variable has to increase by more than a factor 3 in this step. Afterwards we apply Theorem 1.3, interpreting each interval-machine pair of our variables as a machine of the theorem, thus obtaining an integral assignment of jobs to interval-machine pairs. Note that, thanks to Equations (1.24) and (1.14), the solution does not worsen by more than a constant factor when applying the rounding of Theorem 1.3. We conclude the algorithm by scheduling the jobs greedily as follows.
On each machine, for every $\ell = 1, \ldots, q$, we process all jobs assigned to the $\ell$-th interval as early as possible, ordering them arbitrarily if more than one job is assigned to the same interval. It can be proved that with this algorithm every job $j \in L$ finishes processing before time $(27/2) C_L^*$. We obtain the following result.

Theorem 1.5. There exists a 27/2-approximation algorithm for $R|r_{ij}|\sum w_L C_L$.

1.6.3 Chapter 5: A PTAS for minimizing $\sum w_L C_L$ on parallel machines

In this chapter we design a PTAS for some restricted versions of $P|part|\sum w_L C_L$. We assume that there is a constant number of machines, a constant number of jobs per order, or a constant number of orders. We first describe the case where the number of jobs per order is bounded by a constant $K$, and then justify why this implies the existence of PTASes for the other cases. The results in this chapter closely follow the PTAS for $P|r_j|\sum w_j C_j$ developed by Afrati et al. [1].

As usual in the design of a PTAS, the general idea is to add structure to the solution by modifying the instance in such a way that the cost of the optimal solution does not worsen by more than a factor $(1 + \varepsilon)$. Furthermore, by applying several modifications to the optimal solution of this new instance, we prove that there exists a near-optimal solution satisfying several extra properties. The structure given by these properties allows us to find this solution by exhaustive search or dynamic programming. Since each modification applied to the optimal solution only loses a factor $(1 + \varepsilon)$ in the cost, we can apply a constant number of them, obtaining a solution within a factor $(1 + \varepsilon)^{O(1)}$ of the optimal cost. Then, choosing $\varepsilon$ small enough, we can approximate within a factor arbitrarily close to 1.
As in Chapters 3 and 4, we divide the time horizon into exponentially growing intervals. For each integer $t$, we denote by $I_t$ the interval $[(1 + \varepsilon)^t, (1 + \varepsilon)^{t+1})$, and by $|I_t|$ its length, i.e., $|I_t| = \varepsilon (1 + \varepsilon)^t$.

One of the main techniques we use is "stretching", which consists of stretching the time axis by a factor $(1 + \varepsilon)$. Clearly, this only worsens the solution by a factor $(1 + \varepsilon)$. The two basic stretching techniques are:

1. Stretch Completion Times: This procedure delays each job so that the completion time of a job $j$ becomes $C'_j = (1 + \varepsilon) C_j$ in the new schedule. It is easy to verify that this procedure creates an idle time of $\varepsilon p_j$ before each job $j$.

2. Stretch Intervals: The goal of this procedure is to create idle time in each interval, except for those that have a job covering them completely. As before, it consists of shifting the jobs to the next interval. More precisely, if job $j$ finishes in $I_t$ and occupies $d_j$ time units in $I_t$, we move $j$ to $I_{t+1}$ by shifting it by exactly $|I_t|$ time units, so that it occupies $d_j$ time units in $I_{t+1}$. Then the completion time in the new solution is at most $(1 + \varepsilon) C_j$, and therefore the total cost of the solution increases by at most a factor $(1 + \varepsilon)$. Note that if $j$ starts being processed in $I_t$, where it is processed for $d_j$ time units, then after the shift it is processed in $I_{t+1}$ for at most $d_j$ time units. Since $I_{t+1}$ has $\varepsilon |I_t| = \varepsilon^2 (1 + \varepsilon)^t$ more time units than $I_t$, at least that amount of idle time is created in $I_{t+1}$. Moreover, we can assume that this idle time is consecutive in each interval. Indeed, this can be achieved by moving as far to the left as possible all jobs that are scheduled completely inside an interval.
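A small Python sketch may help check the first stretching technique on a concrete single-machine schedule. The function names (`interval_length`, `stretch_completion_times`, `idle_before`) and the representation of a schedule as a list of `(start, processing time)` pairs are assumptions made for illustration; exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def interval_length(t, eps):
    """Length |I_t| = eps * (1 + eps)^t of I_t = [(1 + eps)^t, (1 + eps)^(t + 1))."""
    return eps * (1 + eps) ** t


def stretch_completion_times(machine_schedule, eps):
    """Stretch Completion Times on one machine.

    machine_schedule is a list of (start, p) pairs sorted by start time and
    without overlaps. Each job keeps its length p and is delayed so that its
    new completion time is (1 + eps) times the old one.
    """
    stretched = []
    for start, p in machine_schedule:
        new_completion = (1 + eps) * (start + p)
        stretched.append((new_completion - p, p))
    return stretched


def idle_before(machine_schedule):
    """Idle time right before each job, measured from the previous completion (or 0)."""
    gaps, previous_end = [], 0
    for start, p in machine_schedule:
        gaps.append(start - previous_end)
        previous_end = start + p
    return gaps
```

If the original schedule has no idle time, the gap created before each job equals exactly $\varepsilon p_j$; with idle time already present, the gap can only be larger. The helper `interval_length` also lets one check that $|I_{t+1}| - |I_t| = \varepsilon |I_t|$, the quantity used in the second technique.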
Before giving a general description of the algorithm, we present a result that guarantees the existence of a $(1 + \varepsilon)$-approximate solution in which no order crosses more than $O(1)$ intervals. For this, we first show the following basic property, which is stated in the more general setting of unrelated machines.

Lemma 1.6. For any instance of $R|part|\sum w_L C_L$ there exists an optimal solution such that:

1. For each order $L \in O$ and each machine $i = 1, \ldots, m$, all jobs in $L$ assigned to machine $i$ are processed consecutively.

2. The sequence in which the orders are processed on each machine is independent of the machine.

Lemma 1.7. Let $s := \lceil \log(1 + 1/\varepsilon) \rceil$. Then there exists a $(1 + \varepsilon)$-approximate schedule in which every order is completely processed within at most $s + 1$ consecutive intervals.

In what follows we describe the general idea of the PTAS. We divide the time horizon into blocks of $s + 1 = \lceil \log(1 + 1/\varepsilon) \rceil + 1$ intervals, and denote by $B_\ell$ the block $[(1 + \varepsilon)^{\ell(s+1)}, (1 + \varepsilon)^{(\ell+1)(s+1)})$. Lemma 1.7 suggests optimizing each block separately, and then joining the solutions of the blocks to build the global solution.

Since there may be orders that cross from one block to the next, it is necessary to perturb the "shape" of the blocks. For this we introduce the concept of "frontier". The "outgoing frontier" of a block $B_\ell$ is a vector with $m$ entries whose $i$-th coordinate contains the completion time of the last job processed on machine $i$ among the jobs belonging to orders that begin processing in $B_\ell$. The "incoming frontier" of a block is the outgoing frontier of the previous block. Given a block, an incoming frontier and an outgoing frontier, we say that an order is processed inside block $B_\ell$ if on each machine all jobs of that order start processing after the incoming frontier and finish processing before the outgoing frontier.
Assume for the moment that we know how to compute a $(1 + \varepsilon)$-approximate solution for a given subset of orders $V \subseteq O$ inside a block $B_\ell$, with incoming and outgoing frontiers $F'$ and $F$ respectively. Let $W(\ell, F', F, V)$ be the cost (weighted sum of order completion times) of this solution.

Let $\mathcal{F}_\ell$ be the set of possible incoming frontiers of block $B_\ell$. Using dynamic programming we can fill a table $T(\ell, F, U)$ that contains the cost of a near-optimal solution for the subset of orders $U \subseteq O$ in block $B_\ell$ or before, respecting the outgoing frontier $F$ of $B_\ell$. To compute this quantity we can use the following recursive formula:
\[
T(\ell + 1, F, U) = \min_{F' \in \mathcal{F}_\ell,\ V \subseteq U} \{ T(\ell, F', V) + W(\ell + 1, F', F, U \setminus V) \}.
\]
Unfortunately, the table $T$ does not have a polynomial number of entries, not even a finite one. It is therefore necessary to reduce its size in the same way as in [1]. With this in mind, the outline of the algorithm is as follows.

Algorithm: PTAS-DP

1. Localization: In this step we bound the time period in which each order can be processed. We give extra structure to the instance by defining a release date $r_L$ for each order $L$, such that there exists a $(1 + \varepsilon)$-approximate solution where each order starts processing after $r_L$ and finishes processing within a certain constant number of intervals after $r_L$. This plays a crucial role in the next step.

2. Polynomial representation of subsets of orders: The goal of this step is to reduce the number of subsets of orders that we need to consider in the dynamic program. To this end, for every $\ell$ we define a polynomial-size set $\Theta_\ell \subseteq 2^O$ of possible subsets of orders that are processed in $B_\ell$ or before in some near-optimal solution.

3. Polynomial representation of frontiers: In this step we reduce the number of frontiers that we need to consider in the dynamic program. For each $\ell$, we find $\widehat{\mathcal{F}}_\ell \subset \mathcal{F}_\ell$, a set of polynomial size such that in each block the outgoing frontier of a near-optimal solution belongs to $\widehat{\mathcal{F}}_\ell$.

4. Dynamic programming: For all $\ell$, $F \in \widehat{\mathcal{F}}_{\ell+1}$, $U \in \Theta_\ell$ we compute:
\[
T(\ell, F, U) = \min_{F' \in \widehat{\mathcal{F}}_\ell,\ V \subseteq U,\ V \in \Theta_{\ell-1}} \{ T(\ell - 1, F', V) + W(\ell, F', F, U \setminus V) \}.
\]
It is clear that it is not necessary to compute $W(\ell, F', F, U \setminus V)$ exactly; a $(1 + \varepsilon)$-approximation of this value, which moves the frontier by at most a factor $(1 + \varepsilon)$, suffices. To compute it, we partition the orders into small and large ones. For the large orders we use enumeration and essentially try every possible schedule, while the small orders are scheduled greedily.

One of the main difficulties of this approach is that all modifications applied to the optimal solution must preserve the properties given by Lemma 1.6. This is necessary in order to describe the interaction between a block and the next one using only the concept of frontier. In other words, if this were not true, some job of an order that begins processing in a block $B_\ell$ could be processed after a job belonging to an order that begins in block $B_{\ell+1}$. This would increase the complexity of the algorithm, since this interaction would have to be taken into account in the dynamic program. This is the main reason why our result does not generalize directly to the case with nontrivial release dates, since in that case Lemma 1.6 does not hold.

Applying these ideas carefully, the following theorems can be concluded.

Theorem 1.8. Algorithm: PTAS-DP is a polynomial-time approximation scheme for $P|part|\sum w_L C_L$ when the number of jobs per order is bounded by a constant $K$.

Theorem 1.9. Algorithm: PTAS-DP is a polynomial-time approximation scheme for $Pm|part|\sum w_L C_L$.

Theorem 1.10.
Algorithm: PTAS-DP is a polynomial-time approximation scheme for $P|part|\sum w_L C_L$ when the number of orders is constant.

1.7 Conclusions

In this thesis we studied scheduling problems with the objective of minimizing the weighted sum of order completion times. In Chapter 3 we began by studying the special case of minimizing the makespan on unrelated machines. We showed how a simple rounding technique transforms a preemptive schedule into a nonpreemptive one while increasing the makespan by at most a factor 4. We then proved that this result is best possible by constructing a family of instances that almost attain this bound.

In Chapter 4 we presented approximation algorithms for $R|r_{ij}|\sum w_L C_L$ and its preemptive version $R|r_{ij}, pmtn|\sum w_L C_L$. Both algorithms are based on a rounding technique very similar to the one developed in Chapter 3 for minimizing the makespan. Moreover, each algorithm is the first with a constant approximation factor for its problem.

However, several questions remain open. First, we may ask whether the rounding techniques used in each algorithm can be improved. At first sight, the step that seems most amenable to improvement is the one where the values of $y$ are set to zero if they assign a job to a very late interval. Although this is not a proof, in Chapter 3 we showed that a very similar technique yields a rounding that cannot be improved.

Recall that the best known result on the hardness of approximating $R||\sum w_L C_L$ derives from the fact that it is NP-hard to approximate $R||C_{\max}$ within a factor better than 3/2. Considering that the algorithm given in this work guarantees a factor of 27/2, it would be interesting to narrow this gap.
Given the generality of our model, it seems easier to do this by giving a reduction designed specifically for our problem, proving that it is NP-hard to approximate within some factor $\alpha > 3/2$.

In Chapter 5 we gave a PTAS for $P|part|\sum w_L C_L$ when the number of jobs per order, the number of orders, or the number of machines is constant. This generalizes several previously known PTASes, such as those for $P||C_{\max}$ and $P||\sum w_j C_j$. However, it would be interesting to settle whether the more general case $P|part|\sum w_L C_L$ is APX-hard or not.

Finally, another possible direction for continuing this research is to consider our problem in the online setting. In this variant the orders arrive over time, and no information is known about them before their release date. In online problems we are interested in comparing the cost of our solution with the optimal solution of the case where all the information is known from time 0. To this end, the notion of $\alpha$-points (see for example [18, 5, 35, 10]) has proven useful for the problem of minimizing the weighted sum of job completion times, and it would therefore be interesting to study this technique in our more general setting with orders.

Chapter 2

Introduction

2.1 Machine scheduling problems

Machine scheduling problems deal with the allocation of scarce resources over time. They arise in many, very different situations: for example, a construction site where the foreman has to assign jobs to each worker, a CPU that must process tasks requested by several users, or a factory's production lines that must manufacture products for its clients. In general, an instance of a scheduling problem contains a set of $n$ jobs $J$ and a set of $m$ machines $M$ on which the jobs in $J$ must be processed.
A solution of the problem is a schedule, i.e., an assignment that specifies when and on which machine $i \in M$ each job $j \in J$ is executed.

To classify scheduling problems we look at the characteristics or attributes of the machines and jobs, as well as at the objective function to be optimized. One of these is the machine environment, i.e., the characteristics of the machines in our model. For example, we can consider identical or parallel machines, where each machine is an identical copy of all the others. In this setting each job $j \in J$ takes time $p_j$ to be processed, independently of the machine on which it is scheduled. On the other hand, we can consider a more general situation where each machine $i \in M$ has a different speed $s_i$, so that the time it takes to process job $j$ on it is inversely proportional to the speed of the machine.

Additionally, scheduling problems can be classified according to the characteristics of the jobs. To name a few, our model may consider nonpreemptive jobs, i.e., jobs that cannot be interrupted until they are completed, or preemptive jobs, i.e., jobs that can be interrupted at any time and later resumed on the same or on a different machine.

We can also classify problems according to the objective function. One of the most natural objective functions is the makespan, i.e., the point in time at which the last job finishes. More precisely, if for some schedule we define the completion time of a job $j \in J$, denoted by $C_j$, as the time at which job $j$ finishes processing, then the objective is to minimize $C_{\max} := \max_{j \in J} C_j$. Another classical example consists of minimizing the number of late jobs. In this setting, each job $j \in J$ has a deadline $d_j$ and the objective is to minimize the number of jobs that finish processing after their deadline. Besides these, several other objective functions can be considered.
A large number of scheduling problems can be obtained by combining the characteristics just mentioned, so it becomes necessary to introduce a standard notation for all these different problems. For this, Graham, Lawler, Lenstra and Rinnooy Kan [20] introduced the "three field notation", where a scheduling problem is represented by an expression of the form $\alpha|\beta|\gamma$. Here, the first field $\alpha$ denotes the machine environment, the second field $\beta$ contains extra constraints or characteristics of the problem, and the last field $\gamma$ denotes the objective function. In the following we describe the most common values for $\alpha$, $\beta$ and $\gamma$.

1. Values of $\alpha$.

• $\alpha = 1$: Single Machine. There is only one machine at our disposal to process the jobs. Each job $j \in J$ takes a given time $p_j$ to be processed.

• $\alpha = P$: Parallel Machines. We have a number $m$ of identical or parallel machines to process the jobs. The processing time of job $j$ is given by $p_j$, independently of the machine where job $j$ is processed.

• $\alpha = Q$: Related Machines. In this setting each machine $i \in M$ has an associated speed $s_i$. The processing time of job $j \in J$ on machine $i \in M$ equals $p_j / s_i$, where $p_j$ is the time it takes to process $j$ on a machine of speed 1.

• $\alpha = R$: Unrelated Machines. In this more general setting there is no a priori relation between the processing times of jobs on each machine, i.e., the processing time of job $j \in J$ on machine $i \in M$ is an arbitrary number denoted by $p_{ij}$.

Additionally, in the case that $\alpha = P$, $Q$ or $R$, we can add the letter $m$ at the end of the field, indicating that the number of machines $m$ is constant. For example, if under a parallel machine environment the number of machines is constant, then $\alpha = Pm$. The value of $m$ can also be specified, e.g., $\alpha = P2$ means that there are exactly 2 parallel machines to process the jobs.

2. Values of $\beta$.

• $\beta = pmtn$: Preemptive Jobs. In this setting we consider jobs that can be preempted, i.e., jobs that can be interrupted and resumed later on the same or on a different machine.

• $\beta = r_j$: Release Dates. Each job $j \in J$ has an associated release date $r_j$, such that $j$ cannot start processing before that time.

• $\beta = prec$: Precedence Constraints. Consider a partial order relation over the jobs $(J, \prec)$. If $j \prec k$ for some pair of jobs $j$ and $k$, then $k$ must start processing after the completion time of job $j$.

3. Values of $\gamma$.

• $\gamma = C_{\max}$: Makespan. The objective is to minimize the makespan $C_{\max} := \max_{j \in J} C_j$.

• $\gamma = \sum C_j$: Average Completion Time. We must minimize the average of the completion times, or equivalently $\sum_{j \in J} C_j$.

• $\gamma = \sum w_j C_j$: Sum of Weighted Completion Times. Consider a weight $w_j$ for each $j \in J$. The objective is to minimize the sum of weighted completion times $\sum_{j \in J} w_j C_j$.

It is worth noticing that by default we consider nonpreemptive jobs. In other words, if the field $\beta$ is empty, then jobs cannot be preempted. For example, $R||\sum w_j C_j$ denotes the problem of finding a nonpreemptive schedule of a set of jobs $J$ on a set of machines $M$, where each job $j \in J$ takes $p_{ij}$ units of time to process on machine $i \in M$, minimizing $\sum_{j \in J} w_j C_j$. As a second example, $R|r_j|\sum w_j C_j$ denotes the same problem as before, with the only difference that a job $j$ can only start processing after $r_j$. Also, note that the field $\beta$ can take more than one value. For example, $R|prec, r_j|\sum w_j C_j$ is the same as the last problem, but adding precedence constraints.

Among all scheduling problems, most non-trivial ones are NP-hard, and therefore there is no polynomial time algorithm to solve them unless P = NP. In particular, as we will show later, one of the fundamental problems in scheduling, $P2||C_{\max}$, can be easily proven NP-hard. In the following section we describe some general techniques to address NP-hard optimization problems and some basic applications to scheduling.
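As a small illustration of the notation, the following sketch evaluates the objectives $C_{\max}$ and $\sum_j w_j C_j$ of a given nonpreemptive schedule in the environment $R|r_j|\cdot$. It only evaluates a schedule and does not optimize anything. The function name `evaluate_schedule`, the dictionary-based data layout and the toy instance are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def evaluate_schedule(
    sequences: Dict[int, List[int]],   # machine i -> jobs in processing order
    p: Dict[Tuple[int, int], float],   # (i, j) -> processing time p_ij
    r: Dict[int, float],               # j -> release date r_j
    w: Dict[int, float],               # j -> weight w_j
) -> Tuple[Dict[int, float], float, float]:
    """Return (completion times C_j, makespan C_max, sum of w_j * C_j)."""
    completion: Dict[int, float] = {}
    for machine, jobs in sequences.items():
        t = 0.0  # time at which the machine becomes idle
        for j in jobs:
            start = max(t, r[j])          # a job cannot start before r_j
            t = start + p[(machine, j)]   # nonpreemptive: runs to completion
            completion[j] = t
    makespan = max(completion.values())
    weighted_sum = sum(w[j] * completion[j] for j in completion)
    return completion, makespan, weighted_sum


# Hypothetical toy instance: 2 unrelated machines, 3 jobs.
p = {(0, 0): 2.0, (0, 1): 3.0, (0, 2): 1.0,
     (1, 0): 4.0, (1, 1): 1.0, (1, 2): 5.0}
r = {0: 0.0, 1: 1.0, 2: 0.0}
w = {0: 1.0, 1: 2.0, 2: 3.0}
sequences = {0: [2, 0], 1: [1]}   # machine 0 runs job 2 then job 0

completion, makespan, weighted_sum = evaluate_schedule(sequences, p, r, w)
```

In this toy schedule job 1 waits for its release date on machine 1, which is the only place where the field $\beta = r_j$ changes the outcome.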
2.2 Approximation algorithms

The introduction of the class of NP-complete problems by Cook [11], Karp [24] and, independently, Levin [31] left big challenges about how these problems could be tackled given their apparent intractability. One option that has been widely studied is the use of algorithms that solve the problem exactly but have no polynomial upper bound on their running time. This kind of algorithm can be useful on small to medium instances, or on instances with some special structure where the algorithm runs fast enough in practice. Nevertheless, there may be other instances where the algorithm takes exponential time to finish, becoming impractical. The most common of these approaches are Branch & Bound, Branch & Cut and Integer Programming techniques.

For the special case of NP-hard optimization problems, another alternative is to use algorithms that run in polynomial time but may not solve the problem to optimality. Among this kind of algorithms, a particularly interesting class is that of "approximation algorithms", i.e., algorithms whose solution is guaranteed to be, in some sense, close to the optimal solution.

More formally, let us consider a minimization problem $P$ with cost function $c$. For $\alpha \ge 1$, we say that a solution $S$ to $P$ is an $\alpha$-approximation if its cost $c(S)$ is within a factor $\alpha$ of the cost $OPT$ of the optimal solution, i.e., if

$$c(S) \le \alpha \cdot OPT. \qquad (2.1)$$

Now, consider a polynomial-time algorithm $A$ whose output on instance $I$ is $A(I)$. Then $A$ is an $\alpha$-approximation algorithm if for any instance $I$, $A(I)$ is an $\alpha$-approximation. The number $\alpha$ is called the approximation factor of algorithm $A$, and if $\alpha$ does not depend on the input we say that $A$ is a constant factor approximation algorithm.

Analogously, if $P$ is a maximization problem with objective function $c$, a solution $S$ is an $\alpha$-approximation, for $\alpha \le 1$, if

$$c(S) \ge \alpha \cdot OPT.$$

As before, for $\alpha \le 1$, an algorithm $A$ is an $\alpha$-approximation algorithm if $A(I)$ is an $\alpha$-approximation for any instance $I$.
In the remainder of this document we will only study minimization problems, and therefore we will not use this definition.

One of the first approximation algorithms for an NP-hard optimization problem was presented by R.L. Graham [19] in 1966, even before the notion of NP-completeness was formally introduced. Graham studied the problem of minimizing the makespan on parallel machines, $P||C_{\max}$. He proposed a greedy algorithm consisting of: (1) order the jobs arbitrarily, $(j_1, \ldots, j_n)$; (2) for $k = 1, \ldots, n$, schedule job $j_k$ on the machine where it would begin processing first. Such a procedure is called a list-scheduling algorithm.

Lemma 2.1 (Graham 1966 [19]). List-scheduling is a $(2 - 1/m)$-approximation algorithm for $P||C_{\max}$.

Proof. First notice that if $OPT$ denotes the makespan of the optimal solution, then

$$OPT \ge \frac{1}{m} \sum_{j \in J} p_j, \qquad (2.2)$$

since otherwise the total machine time available up to time $OPT$, namely $m \cdot OPT$, would be less than the total work $\sum_{j \in J} p_j$. Let $\ell$ be such that $C_{j_\ell} = C_{\max}$, and denote by $S_j = C_j - p_j$ the starting time of a job $j \in J$. Then, noting that at the $\ell$-th step of the algorithm all machines were busy until time $S_{j_\ell}$,

$$S_{j_\ell} \le \frac{1}{m} \sum_{k=1}^{\ell-1} p_{j_k},$$

and therefore,

$$C_{\max} = S_{j_\ell} + p_{j_\ell} \le \frac{1}{m} \sum_{k=1}^{\ell} p_{j_k} + \left(1 - \frac{1}{m}\right) p_{j_\ell} \le \left(2 - \frac{1}{m}\right) OPT, \qquad (2.3)$$

where the last inequality follows from (2.2) and the fact that $p_{j_\ell} \le OPT$, since no schedule can finish before $p_j$ for any $j \in J$.

As we could observe, a crucial step in the previous analysis is to obtain a good lower bound on the optimal solution (for example Equation (2.2) in the last lemma), and then use it to upper bound the solution given by the algorithm (as in Equation (2.3)). Most techniques to find lower bounds are problem specific, and therefore it is hard to give general rules for how to find them. One of the few exceptions, which has proven useful in a wide variety of problems, consists of formulating the optimization problem as an integer program and then relaxing its integrality constraints.
Clearly, the optimal value of the relaxed problem is a lower bound on the optimal value of the original problem. An algorithm that uses this technique is called an LP-based approximation algorithm. To illustrate this idea, consider the following problem.

Minimum Cost Vertex-Cover:
Input: A graph $G = (V, E)$ and a nonnegative cost function $c : V \to \mathbb{Q}$ over the vertices.
Objective: Find a vertex-cover, i.e., a set $B \subseteq V$ that intersects every edge in $E$, minimizing the cost $c(B) = \sum_{v \in B} c(v)$.

It is easy to see that this problem is equivalent to the following integer program:

$$[\text{LP}] \quad \min \sum_{v \in V} y_v \, c(v) \qquad (2.4)$$
$$y_v + y_w \ge 1 \quad \text{for all } vw \in E, \qquad (2.5)$$
$$y_v \in \{0, 1\} \quad \text{for all } v \in V. \qquad (2.6)$$

Therefore, by replacing Equation (2.6) by $y_v \ge 0$, we obtain a linear program whose optimal value is a lower bound on the optimal value of the Minimum Cost Vertex-Cover problem. To get a constant factor approximation algorithm, we proceed as follows. First solve the relaxation of [LP] (for example, using the ellipsoid method), and call the solution $y^*$. To round this fractional solution, first note that Equation (2.5) implies that for every edge $vw \in E$ at least one of $y_v^* \ge 1/2$ or $y_w^* \ge 1/2$ holds. Then the set $B = \{v \in V : y_v^* \ge 1/2\}$ is a vertex-cover, and furthermore we can bound its cost as

$$c(B) = \sum_{v : y_v^* \ge 1/2} c(v) \le 2 \sum_{v \in V} y_v^* \, c(v) \le 2 \, OPT_{LP} \le 2 \, OPT, \qquad (2.7)$$

where $OPT$ denotes the cost of the optimum solution of the vertex-cover problem and $OPT_{LP}$ is the optimal value of the relaxation of [LP]. Thus, the algorithm just described is a 2-approximation algorithm.

Noting that $OPT \le c(B)$, Equation (2.7) implies that

$$\frac{OPT}{OPT_{LP}} \le 2$$

for any instance $I$ of Minimum Cost Vertex-Cover. More generally, any $\alpha$-approximation algorithm that uses $OPT_{LP}$ as a lower bound must satisfy

$$\max_I \frac{OPT}{OPT_{LP}} \le \alpha.$$

The left hand side of this last inequality is called the integrality gap of the linear program. Finding a lower bound on the integrality gap is a common technique to determine the best approximation factor that a linear program can yield.
To do this we just need to find an instance with a large ratio $OPT / OPT_{LP}$. For example, it is easy to show that the rounding we just described for Minimum Cost Vertex-Cover is best possible. Indeed, taking $G$ to be the complete graph on $n$ vertices and the cost function $c \equiv 1$, we get that $OPT = n - 1$ and $OPT_{LP} = n/2$, and thus $OPT / OPT_{LP} \to 2$ when $n \to \infty$.

2.3 Polynomial time approximation schemes

For a given NP-hard problem, it is natural to ask what is the best possible approximation algorithm in terms of its approximation factor. Clearly, this depends on the problem. On one side, there are some problems that do not admit any kind of approximation algorithm unless P = NP. For example, the travelling salesman problem with binary costs cannot be approximated within any factor. Indeed, if there existed an $\alpha$-approximation algorithm for this problem, then we could use it to decide whether or not there exists a Hamiltonian circuit of cost zero: if the optimum value is zero, then the approximation algorithm must return a solution of cost zero by (2.1), independently of the value of $\alpha$; if the optimum value is greater than zero, then the algorithm will also return a solution with cost greater than zero.

On the other hand, there are some problems that admit arbitrarily good approximation algorithms. To formalize this idea we define a polynomial time approximation scheme (PTAS) as a collection of algorithms $\{A_\varepsilon\}_{\varepsilon > 0}$ such that each $A_\varepsilon$ is a $(1 + \varepsilon)$-approximation algorithm that runs in polynomial time. Let us remark that $\varepsilon$ is not considered part of the input, and therefore the running time of the algorithm could depend exponentially on $1/\varepsilon$.

A common technique to find a PTAS is to "round" the instance so that the solution space is significantly reduced while the value of the optimal solution changes only slightly. Then we can use exhaustive search or dynamic programming to find an optimal or near-optimal (i.e., a $(1 + \varepsilon)$-approximate) solution to the rounded problem.
To obtain an almost-optimal solution to the original problem, we transform the solution of the rounded instance without increasing the cost by more than a $1 + O(\varepsilon)$ factor.

We briefly show this technique by applying it to $P2||C_{\max}$, i.e., the problem of minimizing the makespan on two parallel machines. Consider a fixed $0 < \varepsilon < 1$, and call $OPT$ the makespan of the optimal solution. We will show how to find a schedule of makespan at most $(1 + \varepsilon)^2 OPT \le (1 + 3\varepsilon) OPT$, which is enough by redefining $\varepsilon \leftarrow \varepsilon / 3$. Begin by rounding up the value of each $p_j$ to a power of $(1 + \varepsilon)$,

$$p_j \leftarrow (1 + \varepsilon)^{\lceil \log_{1+\varepsilon} p_j \rceil}.$$

With this, the processing time of each job is increased by at most a $(1 + \varepsilon)$ factor, and so is the optimal makespan. In other words, denoting by $OPT_r$ the optimal makespan of the rounded instance, $OPT_r \le (1 + \varepsilon) OPT$. Then it is enough to find a $(1 + \varepsilon)$-approximation for the rounded instance, since using that assignment of jobs to machines on the original instance can only decrease the makespan of the solution, thus yielding a $(1 + \varepsilon)^2$-approximation.

For this, let $P = \max_j p_j$, and define a job to be "big" if $p_j \ge \varepsilon P$ and "small" otherwise. Thanks to our rounding, the number of different values the processing time of a big job can take is at most $\lfloor \log_{1+\varepsilon} 1/\varepsilon \rfloor + 1 = O(1)$. Also, notice that a schedule of big jobs is determined by specifying how many jobs of each size are assigned to each of the two machines. Thus, we can enumerate all schedules of big jobs in time $n^{\lfloor \log_{1+\varepsilon} 1/\varepsilon \rfloor + 1} = n^{O(1)} = \mathrm{poly}(n)$, and take the one with the shortest makespan.

To schedule small jobs, notice that a list-scheduling algorithm is enough: process the jobs one at a time, in any order, each on the machine that would finish first. Clearly, this yields a $(1 + \varepsilon)$-approximation for the rounded instance. Indeed, if after adding the small jobs the makespan was not increased, then the solution constructed is optimal.
On the other hand, if adding the small jobs increased the makespan, then the difference between the completion times of the two machines is less than $\varepsilon P \le \varepsilon \, OPT_r$. Therefore, the makespan of the solution constructed is less than $(1 + \varepsilon) OPT_r \le (1 + \varepsilon)^2 OPT$. Thus, we can construct a $(1 + \varepsilon)^2$-approximation for the original problem in polynomial time.

Although the algorithm that we just showed runs in polynomial time for any fixed $\varepsilon$, the running time increases exponentially as $\varepsilon$ decreases. Thus, we may ask if we can do even better, e.g., if we can find a PTAS whose running time is also polynomial in $1/\varepsilon$. Such a scheme is called a fully polynomial time approximation scheme (FPTAS). Unfortunately, there are only few problems that admit an FPTAS. Indeed, it can be shown that no strongly NP-hard problem admits an FPTAS, unless P = NP (see for example [42], Ch. 8).

In the next section we describe the problem that we are going to work on in this thesis. Not surprisingly the problem is NP-hard, and thus the tools discussed in this and in the previous sections will be helpful to study it.

2.4 Problem definition

In this thesis we study a natural scheduling problem arising in manufacturing environments. Consider a setting where clients place orders, consisting of one or more products, with a given manufacturer. Each product has a machine dependent processing requirement, and has to be processed on one of the $m$ machines available for production. The manufacturer has to find a schedule so as to give the best possible service to its clients.

In its most general form, the problem we consider is as follows. We are given a set of jobs $J$ and a set of orders $O \subseteq \mathcal{P}(J)$, such that $\bigcup_{L \in O} L = J$. Each job $j \in J$ is associated with a value $p_{ij}$ which represents its processing time on machine $i$, while each order $L$ has a weight factor $w_L$ depending on how important it is for the manufacturer.
Also, job $j$ is associated with a machine dependent release date $r_{ij}$, so it can start being processed on machine $i$ no earlier than time $r_{ij}$. An order is completed once all its jobs have been processed. Therefore, if $C_j$ denotes the point in time at which job $j$ is completed, $C_L = \max\{C_j : j \in L\}$ denotes the completion time of order $L$. The goal of the manufacturer is to find a nonpreemptive schedule on the $m$ available machines so as to minimize the sum of weighted completion times of orders, i.e.,

$$\min \sum_{L \in O} w_L C_L.$$

We refer to this objective function as the sum of weighted completion times of orders. Let us remark that in this general framework we are not restricted to the case where the orders are disjoint, and therefore one job may participate in the completion time of several orders.

To adopt the three field scheduling notation we denote this problem as $R|r_{ij}|\sum w_L C_L$, or $R||\sum w_L C_L$ in case all release dates are zero. When the processing times $p_{ij}$ do not depend on the machine, we replace the "$R$" by a "$P$". Also, when we impose the additional constraint that orders are disjoint subsets of jobs, we add $part$ in the second field $\beta$ of the notation.

As will be shown later, our problem generalizes several classic machine scheduling problems. Most notably, these include $R||C_{\max}$, $R|r_{ij}|\sum w_j C_j$ and $1|prec|\sum w_j C_j$. Since all of these are NP-hard in the strong sense (see for example [17]), so is our more general setting. It is somewhat surprising that the best known approximation algorithms for all these problems have an approximation guarantee of 2 [4, 35, 37]. However, for our more general setting, no constant factor approximation is known. The best known result, due to Leung, Li, Pinedo and Zhang [29], is an algorithm for the special case of related machines (i.e., $p_{ij} = p_j / s_i$, where $s_i$ is the speed of machine $i$) and without release dates on jobs. The approximation factor of the algorithm is $1 + \rho(m - 1)/(\rho + m - 1)$, where $\rho$ is the ratio of the speed of the fastest machine to that of the slowest machine. In general this guarantee is not constant and can be as bad as $m/2$.

2.5 Previous work

To illustrate the flexibility of our model, we now review some relevant scheduling models in different machine environments that lie in our framework.

2.5.1 Single machine

We begin by considering the problem of minimizing the sum of weighted completion times of orders on one machine. First we study the simple case where no job belongs to more than one order, $1|part|\sum w_L C_L$, showing that it is equivalent to $1||\sum w_j C_j$. The latter, as was shown by Smith [41], can be solved to optimality by scheduling jobs in non-increasing order of $w_j / p_j$. In the literature, this greedy algorithm is known as Smith's rule.

To see that these two problems are indeed equivalent, we first show that there is an optimal schedule of $1|part|\sum w_L C_L$ where all jobs of each order $L \in O$ are processed consecutively. To see this, consider an optimal schedule where this does not hold. Then there exist jobs $j, \ell \in L$ and $k \in L' \ne L$, such that $k$ starts processing at $C_j$ and $\ell$ is processed after $k$. Swapping jobs $j$ and $k$, i.e., delaying $j$ by $p_k$ units of time and bringing forward $k$ by $p_j$ units of time, does not increase the cost of the solution. Indeed, job $k$ decreases its completion time, and so $C_{L'}$ is not increased. Also, order $L$ does not increase its completion time, since job $\ell \in L$, which is still processed after $j$, remains untouched. By iterating this argument, we end with a schedule where all jobs in an order are processed consecutively. Therefore, each order can be seen as a larger job with processing time $\sum_{j \in L} p_j$, and thus our problem is equivalent to $1||\sum w_j C_j$.

We now consider the more general problem $1||\sum w_L C_L$, where we allow jobs to belong to several orders at the same time.
We will prove that this problem is equivalent to single machine scheduling with precedence constraints, denoted by $1|prec|\sum w_j C_j$. Recall that in this problem there is a partial order $\prec$ over the jobs meaning that, if $j \prec k$, then job $j$ must finish being processed before job $k$ begins processing. If $j \prec k$ we say that $j$ is a predecessor of $k$ and $k$ is a successor of $j$. This classic scheduling problem has attracted much attention since the sixties. Lenstra and Rinnooy Kan [26] showed that this problem is strongly NP-hard even with unit weights or unit processing times. On the other hand, several 2-approximation algorithms have been proposed: Hall, Schulz, Shmoys & Wein [21] gave an LP-relaxation based 2-approximation, while Chudak & Hochbaum [6] proposed another 2-approximation based on a half-integral programming relaxation. Also, Chekuri & Motwani [4] and Margot, Queyranne & Wang [32] independently developed a very simple combinatorial 2-approximation. Furthermore, the results in [2, 12] imply that $1|prec|\sum w_j C_j$ is a special case of vertex cover. However, hardness of approximation results were unknown until recently, when Ambühl, Mastrolilli & Svensson [3] proved that there is no PTAS for this problem unless NP-hard problems can be solved in randomized subexponential time.

We now show that $1||\sum w_L C_L$ and $1|prec|\sum w_j C_j$ are equivalent, and therefore all results known for the latter can also be applied to $1||\sum w_L C_L$. First, let us see that every $\alpha$-approximation for $1|prec|\sum w_j C_j$ implies an $\alpha$-approximation for $1||\sum w_L C_L$. Let $I = (J, O)$ be an instance of $1||\sum w_L C_L$, where $J$ is the job set and $O$ the set of orders. We construct an instance $I' = (J', \prec)$ of $1|prec|\sum w_j C_j$ as follows. For each job $j \in J$ there is a job $j' \in J'$ with $p_{j'} = p_j$ and $w_{j'} = 0$. Also, for every order $L \in O$ we consider an extra job $j(L) \in J'$ with processing time $p_{j(L)} = 0$ and weight $w_{j(L)} = w_L$.
The only precedence constraints that we impose are $j' \prec j(L)$ for all $j \in L$ and every $L \in O$. Since $p_{j(L)} = 0$, we can restrict ourselves to schedules of $I'$ where each $j(L)$ is processed when the last job of $L$ is completed. Thus, it is clear that the optimal solutions to both problems have the same total cost. Furthermore, it is straightforward to note that given an algorithm for $1|prec|\sum w_j C_j$ (approximate or not) we can simply apply it to the instance $I'$ above and impose that $j(L)$ is processed exactly when the last job of $L$ is completed, without a cost increase. The resulting schedule for $I'$ can then be directly applied to the original instance $I$ of $1||\sum w_L C_L$ and its cost remains the same.

To see the other direction, let $I = (J, \prec)$ be an instance of $1|prec|\sum w_j C_j$. To construct an instance $I' = (J', O)$ of $1||\sum w_L C_L$, consider the same set of jobs $J' = J$, and for every job $j \in J'$ let $L(j) \in O$ be the order $\{k \in J : k \preceq j\}$, i.e., $j$ together with all its predecessors, and let $w_{L(j)} = w_j$. With this construction the following lemma holds.

Lemma 2.2. Any schedule of $I'$ can be efficiently transformed into a schedule of the same instance, respecting the underlying precedence constraints and without increasing the cost.

Proof. Let $k$ be the last job that violates a precedence constraint, and let $j$ be the last job that is a successor of $k$ but is scheduled before $k$. We will show that delaying job $j$ to right after job $k$ (see Figure 2.1) does not violate any new precedence constraint and does not increase the total cost. Indeed, if moving $j$ after $k$ violates a precedence constraint, then there exists a job $j'$ that was originally processed between $j$ and $k$ such that $j \prec j'$. Thus $k \prec j'$, contradicting the choice of $j$ and $k$. Also, note that no job other than $j$ increases its completion time. Furthermore, the completion time of each order containing $j$ is not increased, since each such order also contains job $k$, and the completion time of $j$ in the new schedule is the same as the completion time of $k$ in the old schedule.

[Figure 2.1: Top: original schedule. Bottom: schedule after delaying $j$.]

With this lemma we conclude that the optimal schedule for instance $I$ of $1|prec|\sum w_j C_j$ has the same cost as that for instance $I'$ of $1||\sum w_L C_L$. Moreover, any $\alpha$-approximate schedule for instance $I'$ of $1||\sum w_L C_L$ can be transformed into a schedule for instance $I$ of $1|prec|\sum w_j C_j$ of the same cost. Thus, the following holds.

Theorem 2.3. The approximability thresholds of $1|prec|\sum w_j C_j$ and $1||\sum w_L C_L$ coincide.

2.5.2 Parallel machines

In this section we discuss scheduling on parallel machines, where the processing time $p_{ij} = p_j$ of each job $j$ does not depend on the machine on which it is processed. Recall the previously defined problem of minimum makespan scheduling on parallel machines, $P||C_{\max}$, which consists of finding a schedule of $n$ jobs on $m$ parallel machines so as to minimize the maximum completion time. Notice that if in our setting $O$ contains only one order, then the objective function becomes $\max_{j \in J} C_j = C_{\max}$, and therefore $P||C_{\max}$ is a special case of $P||\sum w_L C_L$, which in turn is a special case of our more general model $R|r_{ij}|\sum w_L C_L$.

The problem $P||C_{\max}$ is a classical machine scheduling problem. It can be easily proven NP-hard, even for 2 machines. Indeed, consider the 2Partition problem where, for a given multiset of positive integers $S = \{a_1, \ldots, a_n\}$, we must decide whether there exists a partition of the index set $\{1, \ldots, n\}$ into two sets $R$ and $T$ such that $\sum_{j \in R} a_j = \sum_{j \in T} a_j = \frac{1}{2} \sum_{j=1}^{n} a_j$. Then, for a given multiset $S$, consider $n$ jobs where job $j = 1, \ldots, n$ has processing time $p_j = a_j$. Finding the minimum makespan schedule on two parallel machines would let us solve 2Partition: the minimum makespan equals $\frac{1}{2} \sum_{j \in J} p_j$ if and only if there exist disjoint sets $J_1, J_2 \subseteq J$ with $J_1 \cup J_2 = J$, corresponding to the sets of jobs processed on each machine, such that $\sum_{j \in J_1} p_j = \sum_{j \in J_2} p_j = \frac{1}{2} \sum_{j \in J} p_j$. Thus, since 2Partition is NP-complete [24, 17], we conclude that $P2||C_{\max}$ is NP-hard. On the other hand, as shown in Lemma 2.1, a list-scheduling approach yields a 2-approximation algorithm. Furthermore, Hochbaum and Shmoys [22] presented a PTAS for the problem (see also [42, Chapter 10]).

On the other hand, when in our model each order contains only one job, the problem becomes equivalent to minimizing the sum of weighted completion times of jobs, $\sum_{j \in J} w_j C_j$. Thus, in this case, the parallel machine version of our problem with no release dates becomes $P||\sum w_j C_j$. The study of this problem also goes back to the sixties (see [9] for an early treatment). As in the makespan case, the problem is NP-hard already for two machines. On the other hand, a sequence of approximation algorithms had been proposed until Skutella and Woeginger [40] found a PTAS for the problem. Later, Afrati et al. [1] extended this result to the case of non-trivial release dates.

A natural question is thus whether there exists a PTAS for $P|part|\sum w_L C_L$ (notice that, as shown in Section 2.5.1, the slightly more general problem $P||\sum w_L C_L$ is unlikely to have a PTAS). Although we do not know whether such a PTAS exists, Leung, Li, and Pinedo [28] (see also Yang and Posner [44]) presented a 2-approximation algorithm for this problem.
We briefly give an alternative analysis of Leung et al.'s algorithm by using a classic linear programming framework, first developed by Queyranne [33] for the single machine problem. Let $M_j$ be the midpoint of job $j$ in a given schedule, in other words, $M_j = C_j - p_j / 2$. Eastman et al. [15] implicitly showed that for any set of jobs $S \subseteq J$ and any feasible schedule on $m$ parallel machines, the inequality

$$\sum_{j \in S} p_j M_j \ge \frac{p(S)^2}{2m}$$

is satisfied, where $p(S) = \sum_{j \in S} p_j$. These inequalities are called the parallel inequalities. It follows that if OPT denotes the value of an optimal schedule, then OPT is lower bounded by the following linear program:

$$[\text{LP}] \quad \min \sum_{L \in O} w_L C_L$$
$$C_L \ge M_j + p_j / 2 \quad \text{for all } L \in O \text{ and } j \in L,$$
$$\sum_{j \in S} p_j M_j \ge \frac{p(S)^2}{2m} \quad \text{for all } S \subseteq J.$$

Queyranne [33] showed that [LP] can be solved in polynomial time, since separating the parallel inequalities reduces to submodular function minimization. Let $M_1^*, \ldots, M_n^*$ be an optimal solution and assume without loss of generality that $M_1^* \le M_2^* \le \cdots \le M_n^*$. Clearly, $C_L^* = \max\{M_j^* + p_j / 2 : j \in L\}$, so the optimal solution is completely determined by the $M^*$ values. Consider the algorithm that first solves [LP] and then schedules jobs using a list-scheduling algorithm according to the order $M_1^* \le M_2^* \le \cdots \le M_n^*$. Let $C_j^A$ denote the completion time of job $j$ in the schedule given by the algorithm, so that $C_L^A = \max\{C_j^A : j \in L\}$. It is easy to see that $C_j^A$ equals the time $S_j^A$ at which job $j$ is started by the algorithm, plus $p_j$. Furthermore, at any point in time before $S_j^A$ all machines were busy processing jobs in $\{1, \ldots, j - 1\}$, thus $S_j^A \le p(\{1, \ldots, j - 1\}) / m$. It follows that

$$C_L^A \le \max_{j \in L} \left( \frac{p(\{1, \ldots, j - 1\})}{m} + p_j \right).$$

Also, $M_j^* \, p(\{1, \ldots, j\}) \ge \sum_{l \in \{1, \ldots, j\}} p_l M_l^* \ge p(\{1, \ldots, j\})^2 / 2m$. Then,

$$C_L^* \ge \max_{j \in L} \left( \frac{p(\{1, \ldots, j\})}{2m} + \frac{p_j}{2} \right).$$

We conclude that $C_L^A \le 2 C_L^*$, which implies that the algorithm returns a solution which is within a factor of 2 of OPT.
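The list-scheduling step used by this algorithm (and already in Lemma 2.1) can be sketched as follows. The sketch takes the priority order as given; solving [LP] to obtain the midpoints $M_j^*$ is outside its scope. The names `list_schedule` and `order_completion_times`, as well as the toy data, are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Sequence


def list_schedule(priority: Sequence[int], p: Dict[int, float], m: int) -> Dict[int, float]:
    """Process jobs in the given order, each on the machine that becomes free first.

    Returns the completion time C_j of every job on m identical machines.
    """
    free_at = [(0.0, i) for i in range(m)]  # (time machine i becomes free, i)
    heapq.heapify(free_at)
    completion: Dict[int, float] = {}
    for j in priority:
        start, i = heapq.heappop(free_at)   # earliest available machine
        completion[j] = start + p[j]
        heapq.heappush(free_at, (completion[j], i))
    return completion


def order_completion_times(
    orders: Dict[str, List[int]], completion: Dict[int, float]
) -> Dict[str, float]:
    """C_L = max of C_j over the jobs j in order L."""
    return {name: max(completion[j] for j in jobs) for name, jobs in orders.items()}


# Hypothetical toy instance: 4 jobs, 2 identical machines, 2 disjoint orders.
p = {1: 3.0, 2: 2.0, 3: 2.0, 4: 4.0}
orders = {"A": [1, 3], "B": [2, 4]}
weights = {"A": 2.0, "B": 1.0}

# The list (1, 2, 3, 4) stands in for the order of the LP midpoints.
completion = list_schedule([1, 2, 3, 4], p, m=2)
c_orders = order_completion_times(orders, completion)
cost = sum(weights[name] * c_orders[name] for name in orders)
```

A heap keeps the next free machine available in logarithmic time per job; with an arbitrary list this is Graham's procedure, while with the list sorted by $M_j^*$ it is the rounding step analyzed above.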
Furthermore, note that this approach not only works for $P|part|\sum w_L C_L$ but also for $P||\sum w_L C_L$.

2.5.3 Unrelated machines

In the unrelated machine setting, our problem is also a common generalization of some classic machine scheduling problems. As before, if there is a single order and $r_{ij} = 0$, our problem becomes minimum makespan scheduling ($R||C_{\max}$), in which the goal is to find a schedule of the $n$ jobs on $m$ unrelated machines so as to minimize the makespan. In a seminal work, Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for $R||C_{\max}$, and showed that it is NP-hard to approximate it within a constant better than 3/2. Thus, the same hardness result holds for $R||\sum w_L C_L$.

On the other hand, if orders are singletons and $r_{ij} = 0$, our problem becomes minimum sum of weighted completion times scheduling ($R||\sum w_j C_j$). In this setting each job $j \in J$ is associated with a processing time $p_{ij}$ and a weight $w_j$. The goal is to find a schedule that minimizes the sum of weighted completion times of jobs. As in the makespan case, the latter problem was shown to be APX-hard [23], and therefore there is no PTAS unless P = NP. On the positive side, Schulz and Skutella [35] used a linear programming relaxation to design an approximation algorithm with performance guarantee $3/2 + \varepsilon$ in the case without release dates, and $2 + \varepsilon$ in the more general case. Furthermore, Skutella [38] refined this result by means of a convex quadratic programming relaxation, obtaining a 3/2-approximation algorithm in the case of trivial release dates and a 2-approximation algorithm in the more general case.

Finally, it is worth mentioning that our problem also generalizes assembly scheduling problems that have received attention recently, which we denote by $A||\sum w_j C_j$ (see e.g. [7, 8, 30]). In this setting we are given a set $M$ of $m$ machines and a set of jobs $J$ with associated weights $w_j$. Each job has $m$ parts, one to be processed by each machine.
So, $p_{ij}$ denotes the processing time of the $i$-th part of job $j$, which must be processed on machine $i$. The goal is to minimize the sum of weighted completion times ($\sum w_j C_j$), where the completion time $C_j$ of job $j$ is the time by which all of its parts have been processed. Thus, in our setting, a job with its $m$ parts can be modelled as an order that contains $m$ jobs. To ensure that each of the jobs of each order can only be processed on its corresponding machine, we give it infinite (or sufficiently large) processing time on all the other machines.

Besides proving that the assembly line problem is NP-hard, Chen and Hall [7] and Leung, Li, and Pinedo [30] independently gave a simple 2-approximation algorithm based on the following linear programming relaxation of the problem:

$$[\text{LP}] \quad \min \sum_{j \in N} w_j C_j$$
$$\sum_{j \in S} p_{ij} C_j \ge \frac{p_i(S)^2 + p_i^2(S)}{2} \quad \text{for all } i = 1, \ldots, m, \ S \subseteq N,$$

where $N$ is the set of jobs, $p_i(S) = \sum_{j \in S} p_{ij}$ and $p_i^2(S) = \sum_{j \in S} p_{ij}^2$. Similarly to the 2-approximation described for $P||\sum w_L C_L$ in Section 2.5.2, the algorithm consists of processing jobs according to the order given by an optimal LP solution. Clearly, this is a 2-approximation. Indeed, let $C_1 \le \cdots \le C_n$ be the optimal LP solution (after reordering if needed) and let $S = \{1, \ldots, k\}$. Call $C^H$ and $C^*$ the heuristic and the optimal completion time vectors, respectively. Clearly, $p_i(S) C_k \ge \sum_{j \in S} p_{ij} C_j \ge p_i(S)^2 / 2$, hence $2 C_k \ge p_i(S)$ for all $i \in M$. It follows that $C_k^H = \max_{1 \le i \le m} p_i(S) \le 2 C_k$, and then $\sum w_j C_j^H \le 2 \sum w_j C_j \le 2 \sum w_j C_j^*$, and thus the solution constructed is a 2-approximation.

2.6 Contributions of this work

In this thesis we develop approximation algorithms for $R|r_{ij}|\sum w_L C_L$ and some of its particular cases. In Chapter 3 we begin by presenting some techniques used in the subsequent chapters. First, we review the result of Lawler and Labetoulle [25] showing that $R|pmtn|C_{\max}$, i.e., the problem of minimizing the makespan of preemptive jobs on unrelated machines, is polynomially solvable.
Later, we propose a way of rounding any solution of R|pmtn|Cmax to a solution of R||Cmax, such that the cost of the solution increases by at most a factor of 4. For this we use the classic rounding technique of Shmoys and Tardos [37] for the generalized assignment problem. We conclude this chapter by showing that this rounding is best possible. To this end we construct a sequence of instances for which the ratio between the optimal nonpreemptive makespan and the optimal preemptive makespan is arbitrarily close to 4.

In Chapter 4 we generalize the techniques previously developed. We begin by giving a (4 + ε)-approximation for R|pmtn, rij|∑ wL CL, i.e. for each fixed ε > 0 we show a (4 + ε)-approximation algorithm. The algorithm is based on a time-indexed linear programming relaxation of the problem, which follows that of Dyer and Wolsey [13]. The rounding uses Lawler and Labetoulle's [25] result, described in the previous chapter. We also show a 27/2-approximation algorithm for R|rij|∑ wL CL. This is the first constant factor approximation algorithm for this problem, and thus improves the non-constant factor approximation algorithm for Q|part|∑ wL CL proposed by Leung et al. [29]. Our approach is based on an interval-indexed linear program proposed by Hall et al. [21], and uses a rounding very similar to the one shown in Chapter 3.

In Chapter 5 we design a PTAS for P||∑ wL CL, for the cases when the number of orders is constant, the number of jobs inside each order is constant, or the number of machines is constant. Our algorithm works in all three cases and thus generalizes the known PTASs in [1, 22, 40]. Our approach closely follows the PTAS of Afrati et al. [1] for P|rj|∑ wj Cj. However, the main extra difficulty compared to the setting of Afrati et al. is that we might have orders that are processed over a long period of time, and whose cost is only realized when they are completed.
To overcome this issue, and thus be able to apply the dynamic programming ideas in [1], we simplify the instance and prove that there is a near-optimal solution in which every order is fully processed in a restricted time span. This requires some careful enumeration plus the introduction of artificial release dates.

Finally, in Chapter 6 we summarize all the results, and then propose some possible directions for future investigation.

Chapter 3

On the power of preemption on R||Cmax

In this chapter we study the problem of minimizing the makespan on unrelated machines, R||Cmax, which, as explained before, is a special case of our more general problem of minimizing the sum of weighted completion times of orders on unrelated machines, R||∑ wL CL. The techniques in this chapter will give insight on how to design approximation algorithms for the more general problems R|rij|∑ wL CL and R|rij, pmtn|∑ wL CL.

In Section 3.1, we begin by reviewing the technique developed by Lawler and Labetoulle [25] to solve R|pmtn|Cmax, which shows that this problem is equivalent to solving a linear program. In Section 3.2, we give a quick overview of Lenstra, Shmoys and Tardos's [27] 2-approximation algorithm for R||Cmax, and discuss why it is difficult to apply those ideas to our more general setting. Then, we show how to modify this result to obtain one that is easier to generalize. By doing this we obtain a rounding that turns any preemptive schedule into a nonpreemptive one, such that the makespan increases by at most a factor of 4. On the other hand, in Section 3.3, we prove that this factor is best possible, i.e. there is no rounding that converts a preemptive schedule into a nonpreemptive one with a guarantee better than 4. We achieve this by iteratively constructing a family of almost tight instances.

3.1 R|pmtn|Cmax is polynomially solvable

We now present the algorithm developed by Lawler and Labetoulle, which computes the optimal solution of R|pmtn|Cmax.
It is based on a linear programming formulation that uses assignment variables xij, indicating the fraction of job j ∈ J that is processed on machine i ∈ M. With this, it will be enough to give a way of converting any feasible solution of this linear program into a preemptive schedule of equal makespan, i.e., we need to find a way of distributing the fractions of each job inside each machine, such that no two fractions of the same job are processed in parallel. More precisely, let us consider the following linear program,

\begin{align}
\text{[LL]}\qquad \min\ & C \notag\\
& \sum_{i\in M} x_{ij} = 1 && \text{for all } j\in J, \tag{3.1}\\
& \sum_{j\in J} p_{ij}x_{ij} \le C && \text{for all } i\in M, \tag{3.2}\\
& \sum_{i\in M} p_{ij}x_{ij} \le C && \text{for all } j\in J, \tag{3.3}\\
& x_{ij} \ge 0 && \text{for all } i, j. \tag{3.4}
\end{align}

It is clear that each preemptive schedule induces a feasible solution to [LL]. Indeed, given any preemptive schedule, denote by C its makespan and by xij the fraction of job j that is processed on machine i. In other words, if yij denotes the amount of time that the schedule uses to process job j on machine i, then xij = yij/pij. With this definition, the solution must satisfy Equation (3.1) since every job is completely scheduled. Furthermore, Equation (3.2) is also satisfied since no machine i ∈ M can finish processing before time ∑_j pij xij. Similarly, Equation (3.3) holds since no job j can be processed on two machines at the same time, and thus the left hand side of this equation is a lower bound on the completion time of job j.

Let xij and C be any feasible solution of [LL]. Consider the following algorithm that creates a preemptive schedule of makespan C.

Algorithm: Nonparallel Assignment

1. Define the values zij := pij xij/C, for all i ∈ M and j ∈ J. Note that the vector (zij)ij belongs to the matching polyhedron P of all (yij) ∈ R^{nm} satisfying the following inequalities:

\begin{align}
& \sum_{i\in M} y_{ij} \le 1 && \text{for all } j\in J, \tag{3.5}\\
& \sum_{j\in J} y_{ij} \le 1 && \text{for all } i\in M, \tag{3.6}\\
& y_{ij} \ge 0 && \text{for all } i, j. \tag{3.7}
\end{align}

Also, note that P is integral, since the matrix that defines it is totally unimodular (see for example [34] Ch. 18).
2. Note that by Carathéodory's theorem [14, 16] it is possible to decompose the vector z as a convex combination of a polynomial number of vertices of P. More precisely, we can find vectors Z^k ∈ {0, 1}^{nm} ∩ P and scalars λk ≥ 0 for k = 1, ..., nm + 1, such that zij = ∑_{k=1}^{nm+1} λk Z^k_ij and ∑_{k=1}^{nm+1} λk = 1.

3. Build the schedule as follows. For each i ∈ M, j ∈ J and k = 1, ..., nm + 1 such that Z^k_ij = 1, schedule job j on machine i between time C ∑_{ℓ=1}^{k−1} λℓ and time C ∑_{ℓ=1}^{k} λℓ.

We first show the correctness of the algorithm, and later show that it can be executed in polynomial time.

Lemma 3.1. Let us consider xij and C satisfying equations (3.2), (3.3) and (3.4). Algorithm: Nonparallel Assignment constructs a preemptive schedule of makespan at most C, where the fraction of job j ∈ J processed on machine i ∈ M is xij.

Proof. First, note that for each i ∈ M and j ∈ J, Algorithm: Nonparallel Assignment processes job j during pij xij units of time on machine i. Indeed, for each k = 1, ..., nm + 1, i ∈ M and j ∈ J such that Z^k_ij = 1, the amount of time job j is processed on machine i equals Cλk. Then, since Z^k is binary, the total amount of time job j is processed on machine i equals

\[
\sum_{k=1}^{nm+1} C\lambda_k Z^k_{ij} = C z_{ij} = p_{ij}x_{ij}.
\]

Then, the fraction of job j that is processed on machine i is xij.

Furthermore, no job is processed on two machines at the same time. Indeed, if by contradiction we assume that there is a job that is processed in parallel, then there exist k ∈ {1, ..., nm + 1}, j ∈ J and i, d ∈ M with i ≠ d such that Z^k_ij = Z^k_dj = 1. This implies that ∑_{i∈M} Z^k_ij ≥ 2, contradicting that Z^k belongs to P.

Finally, the makespan of the schedule is at most C, since the algorithm only assigns jobs between time 0 and C ∑_{k=1}^{nm+1} λk = C.

With this the following holds.

Corollary 3.2. To each feasible solution xij, C of [LL] corresponds a preemptive schedule of makespan at most C, and vice versa.
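As a small illustration of step 3 of Algorithm: Nonparallel Assignment, the following Python sketch builds the schedule from a convex decomposition that is assumed to be given already (it does not perform step 2). Each vertex Z^k is represented as a list of (machine, job) pairs; the function names `build_schedule`, `time_on_machine` and `runs_in_parallel` are hypothetical and only serve this example.

```python
from fractions import Fraction


def build_schedule(C, decomposition):
    """Step 3 of the algorithm, assuming the decomposition is already known.

    decomposition: list of (lam_k, matching_k), where matching_k is a list of
    (machine, job) pairs forming a matching and the lam_k sum to 1.
    Returns a list of (job, machine, start, end) pieces.
    """
    schedule = []
    elapsed = Fraction(0)  # sum of lam_1 .. lam_{k-1}
    for lam, matching in decomposition:
        start, end = C * elapsed, C * (elapsed + lam)
        for machine, job in matching:
            schedule.append((job, machine, start, end))
        elapsed += lam
    return schedule


def time_on_machine(schedule, job, machine):
    """Total time that `job` spends on `machine` in the schedule."""
    return sum(end - start
               for j, i, start, end in schedule
               if j == job and i == machine)


def runs_in_parallel(schedule):
    """True if some job is processed on two machines during overlapping times."""
    for a in range(len(schedule)):
        for b in range(a + 1, len(schedule)):
            job_a, mach_a, s_a, e_a = schedule[a]
            job_b, mach_b, s_b, e_b = schedule[b]
            if job_a == job_b and mach_a != mach_b and s_a < e_b and s_b < e_a:
                return True
    return False
```

For instance, with two machines, two jobs, pij = 1, xij = 1/2 and C = 1, the vector z is the average of the two perfect matchings, so calling `build_schedule(Fraction(1), [(Fraction(1, 2), [(0, 0), (1, 1)]), (Fraction(1, 2), [(0, 1), (1, 0)])])` runs each job on one machine during the first half of the horizon and on the other machine during the second half.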
Thus, to solve R|pmtn|Cmax it is enough to compute the optimal solution of [LL], and then turn it into a preemptive schedule using Algorithm: Nonparallel Assignment. Finally, we show that this algorithm runs in polynomial time.

Lemma 3.3. Algorithm: Nonparallel Assignment runs in polynomial time.

Proof. We just need to show that step (2) can be done in polynomial time. For this, consider any polytope P = {x ∈ R^N | Ax ≤ b} for some matrix A ∈ R^{K×N} and vector b ∈ R^K. For any z ∈ P, we need to show how to decompose z as a convex combination of vertices of P. Clearly, it is enough to decompose z = λz′ + (1 − λ)Z, where λ ∈ [0, 1], Z is a vertex of P, and z′ belongs to some proper face P′ of P. Indeed, if this can be done, we can then iterate the argument over z′ ∈ P′. This procedure finishes after at most N steps since the dimension of the polytope decreases after each iteration.

For this, consider z ∈ P. Find any vertex Z ∈ P, which can be done, for example, by minimizing a linear function over the polytope P. We define z′ by projecting z onto the boundary of P. For this, let γ̂ = max{γ ≥ 1 | Z + γ(z − Z) ∈ P}. In other words, if Ai denotes the i-th row of A, then

\[
\hat\gamma = \min\left\{ \frac{b_i - A_i\cdot Z}{A_i\cdot (z - Z)} \;:\; i = 1,\dots,K,\ A_i\cdot(z - Z) > 0 \right\}.
\]

With this, define z′ := Z + γ̂(z − Z) ∈ P, implying that z = z′/γ̂ + Z(γ̂ − 1)/γ̂. Thus, defining λ := 1/γ̂ ≤ 1 we get that z = λz′ + (1 − λ)Z. Finally, note that z′ belongs to a proper face of P. For this, it is enough to show that there is i* ∈ {1, ..., K} such that Ai*·z′ = bi* and Ai*·Z < bi*, which is clear from the choice of γ̂. Then, the face P′ ∋ z′ equals

\[
P' := \{x \in \mathbb{R}^N : A'x \le b'\}, \qquad\text{where}\quad A' := \begin{pmatrix} A \\ -A_{i^*} \end{pmatrix} \quad\text{and}\quad b' := \begin{pmatrix} b \\ -b_{i^*} \end{pmatrix}.
\]

Note that the complexity of this algorithm is O((V + KN) · N), where V denotes the complexity of finding a vertex of P. In general, a vertex can be found using the ellipsoid method, but in our particular problem it can be done much faster.
Indeed, finding a vertex of a face of the matching polyhedron of a bipartite graph can be formulated as finding a matching in a bipartite graph, with the extra restriction that a given subset of vertices must be covered. Clearly this can be done by finding a maximum weight matching, which can be solved in O(n² · m) ([34], Ch. 17.2), where n is the number of jobs and m the number of machines. Finally, since N = nm, the time complexity of the algorithm is O(n³ · m²).

3.2 A new rounding technique for R||Cmax

In 1990, Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for the problem of minimizing the makespan on unrelated machines. For this, they noticed that if the value of the optimal makespan Cmax was known, they could formulate the problem as finding an integral point of a polytope. This polytope, which uses variables xij assigning jobs to machines, is defined by the following set of linear inequalities.

\begin{align}
\text{[LST]}\qquad & \sum_{i\in M} x_{ij} = 1 && \text{for all } j\in J, \notag\\
& \sum_{j\in J} p_{ij}x_{ij} \le C && \text{for all } i\in M, \notag\\
& x_{ij} = 0 && \text{if } p_{ij} > C, \tag{3.8}\\
& x_{ij} \ge 0 && \text{for all } i, j. \notag
\end{align}

If we could find a feasible integral solution of this polytope in polynomial time, then we could solve R||Cmax by doing binary search on C to estimate Cmax.

To obtain a 2-approximation algorithm, Lenstra et al. relaxed the integrality constraints of this feasibility problem, and proposed a rounding technique that turns any vertex of [LST] into a feasible schedule with makespan at most 2C. Later, Shmoys and Tardos [37] refined this rounding so that it turns any feasible solution of [LST] (not just a vertex) into a schedule, without increasing the makespan by more than a factor of 2. Shmoys and Tardos used this new technique to design an approximation algorithm for the generalized assignment problem.
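The binary search over C mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: processing times are taken to be integers, and the feasibility test is a brute-force enumeration of integral assignments (exponential time, usable only on tiny instances) that stands in for the test on [LST]; it is not the method of Lenstra et al., who test the relaxed polytope and then round. The names `brute_force_feasible` and `smallest_feasible_C` are hypothetical.

```python
from itertools import product


def brute_force_feasible(p, C):
    """Stand-in oracle: is there an integral assignment with makespan <= C?

    p[i][j] is the processing time of job j on machine i.
    """
    m, n = len(p), len(p[0])
    for assignment in product(range(m), repeat=n):
        loads = [0] * m
        for job, machine in enumerate(assignment):
            loads[machine] += p[machine][job]
        if max(loads) <= C:
            return True
    return False


def smallest_feasible_C(p, feasible):
    """Binary search for the smallest integer C accepted by `feasible`.

    Relies on monotonicity: if C is feasible, so is every larger value.
    """
    m, n = len(p), len(p[0])
    lo = 0
    # Placing every job on its fastest machine gives a valid upper bound.
    hi = sum(min(p[i][j] for i in range(m)) for j in range(n))
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(p, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

With an exact oracle the search returns the optimal makespan; with the relaxed oracle of Lenstra et al. it would instead return a lower bound C on Cmax, which their rounding turns into a schedule of makespan at most 2C.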
The main technical difficulty in generalizing Lenstra et al.'s rounding technique to our more general problem R||∑ wL CL lies in the fact that the value of the optimal makespan must be known in advance or guessed by a binary search procedure, and it is not clear how to do this in R||∑ wL CL. To overcome this, we further relax [LST] by replacing Equation (3.8) with Equation (3.3), thus removing the nonlinearity on the value of the makespan. With this, we have removed the need to estimate Cmax by a binary search procedure, since we can just minimize the makespan C over a polytope. In other words, we can use the solution of the linear program [LL] as a lower bound for our problem.

In what follows we show how to round any fractional solution of [LL] to an integral one, such that the makespan increases by at most a factor of 4. By Corollary 3.2, this is equivalent to turning any preemptive schedule into a nonpreemptive one, such that the makespan increases by no more than a factor of 4.

Let x and C be a feasible solution of [LL]. The rounding proceeds in two steps: first, we eliminate fractional variables whose corresponding processing time is too large; then, we use the rounding technique of Shmoys and Tardos [37] as a subroutine. Their result is summarized in the next theorem.

Theorem 3.4 (Shmoys and Tardos [37]). Given a nonnegative fractional solution to the following system of equations:

\begin{align}
& \sum_{j\in J}\sum_{i\in M} c_{ij}x_{ij} \le C, \tag{3.9}\\
& \sum_{i\in M} x_{ij} = 1 && \text{for all } j\in J, \tag{3.10}
\end{align}

there exists an integral solution x̂ij ∈ {0, 1} satisfying (3.9), (3.10), and also,

\begin{align}
& x_{ij} = 0 \implies \hat{x}_{ij} = 0 && \text{for all } i\in M,\ j\in J, \notag\\
& \sum_{j\in J} p_{ij}\hat{x}_{ij} \le \sum_{j\in J} p_{ij}x_{ij} + \max\{p_{ij} : x_{ij} > 0\} && \text{for all } i\in M. \tag{3.11}
\end{align}

Furthermore, such an integral solution can be found in polynomial time.

To begin our rounding, we first define a modified solution x′ij as follows:

\begin{equation}
x'_{ij} = \begin{cases} 0 & \text{if } p_{ij} > 2C, \\[2pt] \dfrac{x_{ij}}{X_j} & \text{else,} \end{cases} \tag{3.12}
\end{equation}

where X_j = ∑_{i: pij ≤ 2C} xij for all j ∈ J.
Note that,

\[
1 - X_j = \sum_{i:\,p_{ij} > 2C} x_{ij} \;\le\; \sum_{i:\,p_{ij} > 2C} x_{ij}\,\frac{p_{ij}}{2C} \;\le\; \frac{1}{2},
\]

where the first inequality is strict unless the sum is zero, and the last inequality comes from Equation (3.3). Thus Xj > 1/2, which implies that x′ij satisfies

\begin{align*}
& x'_{ij} \le 2x_{ij} && \text{for all } j\in J,\ i\in M,\\
& \sum_{j\in J} p_{ij}x'_{ij} \le 2C && \text{for all } i\in M.
\end{align*}

Also, note that by construction the following is also satisfied.

\begin{align*}
& \sum_{i\in M} x'_{ij} = 1 && \text{for all } j\in J,\\
& x'_{ij} = 0 && \text{if } p_{ij} > 2C.
\end{align*}

Then, we can apply Theorem 3.4 to x′ij (for cij = 0), to obtain an integral assignment x̂ij such that for all i ∈ M,

\[
\sum_{j\in J} \hat{x}_{ij}p_{ij} \;\le\; \sum_{j\in J} x'_{ij}p_{ij} + \max\{p_{ij} : x'_{ij} > 0\} \;\le\; 2C + 2C = 4C.
\]

Thus, the rounded solution is within a factor 4 of the fractional solution.

3.3 Power of preemption of R||Cmax

We now show that the integrality gap of [LL] is at least 4. This, together with the rounding developed in the previous section, implies that the integrality gap of [LL] is exactly 4. As discussed in Section 2.2, this means that it is not possible to construct a rounding with a factor better than 4, thus implying that the naive rounding developed in the previous section is best possible.

Let us fix β ∈ [2, 4), and ε > 0 such that 1/ε ∈ N. We now construct an instance I = I(β, ε) such that its optimal preemptive makespan is at most C(1 + ε), and such that any nonpreemptive solution of I has makespan at least βC. The construction is done iteratively, maintaining at each iteration a preemptive schedule of makespan (1 + ε)C, while the makespan of any nonpreemptive solution increases. During the construction of the instance, we will interchangeably use the equivalence between feasible solutions of [LL] and preemptive schedules given by Corollary 3.2.

3.3.1 Base case

We begin by constructing an instance I0, which will later be our first iteration. To this end consider a set of 1/ε jobs J0 = {j(0; 1), j(0; 2), ..., j(0; 1/ε)} and a set of 1/ε + 1 machines M0 = {i(1), i(0; 1), ..., i(0; 1/ε)}.
Every job j(0; ℓ) can only be processed on machine i(0; ℓ), where it takes βC units of time to process, and on machine i(1), where it takes a very short time. More precisely, for all ℓ = 1, ..., 1/ε we define,

\[
p_{i(0;\ell)j(0;\ell)} := \beta C, \qquad p_{i(1)j(0;\ell)} := \varepsilon C\,\frac{\beta}{\beta - 1}.
\]

The rest of the processing times are defined as infinite. Note that a feasible fractional assignment is given by setting x_{i(0;ℓ)j(0;ℓ)} = 1/β and x_{i(1)j(0;ℓ)} = f0 := (β − 1)/β, and setting all other variables to zero. The makespan of this fractional solution is exactly (1 + ε)C. Indeed, the load of each machine i ∈ M0, ∑_{j∈J0} xij pij, equals C. Also, the load associated to each job j ∈ J0, ∑_{i∈M0} xij pij, equals C + εC.

Furthermore, no nonpreemptive solution with makespan less than βC can have a job j(0; ℓ) processed on machine i(0; ℓ), and therefore all jobs must be processed on i(1). This yields a makespan of C/f0 = βC/(β − 1). Therefore, the makespan of any nonpreemptive solution is at least min{βC, C/f0}. Note that if β is chosen as 2, the makespan of any nonpreemptive solution must be at least 2C, and therefore the gap of the instance tends to 2 when ε tends to zero.

[Figure 3.1: Instance I0 and its fractional assignment. The values over the arrows, xij and pij, denote the fractional assignment and the processing time respectively.]

3.3.2 Iterative procedure

To increase the integrality gap we proceed iteratively as follows. Starting from instance I0, which will be the base case, we show how to construct instance I1. As we will show later, an analogous procedure can be used to construct instance In+1 from instance In.

Begin by making 1/ε copies of instance I0, denoted I0^ℓ for ℓ = 1, ..., 1/ε, and denote the sets of jobs and machines of I0^ℓ by J0^ℓ and M0^ℓ respectively. Also, denote by i(1; ℓ) the copy of machine i(1) belonging to M0^ℓ (see Figure 3.2).
Consider a new job j(1) for which p_{i(1;ℓ)j(1)} = C(β − β/(β − 1)) for all ℓ = 1, ..., 1/ε (and ∞ otherwise), and define x_{i(1;ℓ)j(1)} = εC/p_{i(1;ℓ)j(1)}. This way, the load of each machine i(1; ℓ) in the fractional solution is (1 + ε)C, and the load corresponding to job j(1) is exactly C. Nevertheless, depending on the value of β, job j(1) may not be completely assigned. A simple calculation shows that for β = (3 + √5)/2, job j(1) is completely assigned in the fractional assignment.

[Figure 3.2: Instance T1 and its fractional assignment. The values over the arrows, xij and pij, denote the fractional assignment and the processing time respectively.]

Furthermore, as justified before, in any nonpreemptive schedule of makespan less than βC, all jobs of instance I0^ℓ must be processed on machine i(1; ℓ). Since job j(1) must also be processed on some machine i(1; ℓ), the load of that machine must be

\[
\sum_{j\in J_0^{\ell}} p_{i(1;\ell)j} + p_{i(1;\ell)j(1)} = \frac{C\beta}{\beta - 1} + C\left(\beta - \frac{\beta}{\beta - 1}\right) = \beta C.
\]

Then, the gap of the instance constructed so far converges to β = (3 + √5)/2 when ε tends to 0, thus improving the gap of 2 shown before.

On the other hand, for β > (3 + √5)/2 (as we would like) there will be some fraction of job j(1),

\[
f_1 := 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(1;\ell)j(1)} = \frac{(\beta - 1)^2 - \beta}{\beta(\beta - 1) - \beta},
\]

that must be processed elsewhere. To overcome this, we do as follows. Let us denote by T1 the instance consisting of jobs ∪_{ℓ=1}^{1/ε} J0^ℓ and machines ∪_{ℓ=1}^{1/ε} M0^ℓ, and construct 1/ε copies of instance T1, denoted T1^k for k = 1, ..., 1/ε. Also, consider 1/ε copies of job j(1), and denote them by j(1; k) for k = 1, ..., 1/ε (see Figure 3.3). As shown before, we can assign a fraction 1 − f1 of each job j(1; k) to machines of T1^k.
To assign the remaining fraction f1, we add an extra machine i(2), with p_{i(2)j(1;k)} := εC/f1 (and ∞ for all other jobs), so that the fraction f1 of each job j(1; k) takes exactly εC to process on i(2). Then, defining x_{i(2)j(1;k)} = f1, the total load of each job j(1; k) does not exceed (1 + ε)C, while the load of machine i(2) is exactly C.

Let us denote the instance we have constructed so far by I1. Notice that I1 is analogous to I0 in the sense that both satisfy the following properties for n = 0, 1,

(i) In any nonpreemptive solution of makespan less than βC, every job j(n; ℓ) must be processed on machine i(n + 1). Therefore the makespan of any nonpreemptive solution is at least min{βC, C/fn}.

(ii) The makespan of the fractional solution constructed is (1 + ε)C. In particular the load of machine i(n + 1) is C, and therefore a fraction of a job which takes less than εC can still be processed on this machine without increasing the makespan.

Furthermore, it is easy to show that C/f0 < C/f1 for β > 2, i.e. the makespan of any nonpreemptive solution increased from I0 to I1, and thus the integrality gap of the instance also increased. In the following we generalize the ideas shown before, and describe the construction of an instance with integrality gap arbitrarily close to β, for any β ∈ [2, 4).

Procedure I

1. Construct I0, f0, and i(1) as in Section 3.3.1, and let n = 0.

2. While fn > 1/(β − 1), construct instance In+1 as follows.

(a) Construct an instance Tn+1 consisting of 1/ε copies of instance In, which we denote by In^ℓ for ℓ = 1, ..., 1/ε, where the copy of machine i(n + 1) belonging to In^ℓ is denoted by i(n + 1; ℓ).

(b) Create 1/ε copies of Tn+1, denoted Tn+1^k for k = 1, ..., 1/ε. Denote the ℓ-th copy of instance In belonging to instance Tn+1^k by In^{ℓk}, and the copy of machine i(n + 1) that belongs to instance In^{ℓk} by i(n + 1; ℓ, k).

(c) Create 1/ε new jobs, j(n + 1; k), for k = 1, ..., 1/ε, and let p_{i(n+1;ℓ,k)j(n+1;k)} = C(β − 1/fn) for all k, ℓ = 1, ..., 1/ε (and ∞ for all other machines). We define the assignment variables for these new jobs as:

\[
x_{i(n+1;\ell,k)j(n+1;k)} := \frac{\varepsilon}{\beta - 1/f_n} \qquad \text{for all } k, \ell = 1,\dots,1/\varepsilon.
\]

This way, the unassigned fraction of each job j(n + 1; k) equals

\begin{align}
f_{n+1} &:= 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(n+1;\ell,k)j(n+1;k)} \tag{3.13}\\
&= \frac{(\beta - 1)f_n - 1}{\beta f_n - 1}. \tag{3.14}
\end{align}

(d) To assign the remaining fraction of jobs j(n + 1; k) for k = 1, ..., 1/ε, we create a new machine i(n + 2), and define p_{i(n+2)j(n+1;k)} = εC/fn+1 for all k = 1, ..., 1/ε (and ∞ for all other jobs). With this we can let x_{i(n+2)j(n+1;k)} = fn+1, so that the loads of each job j(n + 1; k) and of machine i(n + 2) are (1 + ε)C and C respectively.

(e) Call In+1 the instance constructed so far, and redefine n ← n + 1.

Observe that the defined assignment guarantees that the optimal preemptive makespan for In+1 is at most (1 + ε)C.

3. If fn ≤ 1/(β − 1), that is, the first time the condition of step (2) is not satisfied, we do half an iteration as follows.

(a) Make 1/ε copies of In, denoted In^ℓ for ℓ = 1, ..., 1/ε, and call i(n + 1; ℓ) the copy of machine i(n + 1) belonging to In^ℓ.

(b) Create a new job j(n + 1), and define p_{i(n+1;ℓ)j(n+1)} := C(β − 1/fn) and x_{i(n+1;ℓ)j(n+1)} := ε. Notice that this way job j(n + 1) is completely processed in the preemptive solution, and the makespan of the preemptive solution is still (1 + ε)C, since the load of job j(n + 1) equals C(β − 1/fn) ≤ C.

(c) Return In+1, the instance thus constructed.

Lemma 3.5. If Procedure I finishes, then it returns an instance with a gap of at least β/(1 + ε).

Proof. It is enough to show that if the procedure finishes then the makespan of any nonpreemptive solution is at least βC. We proceed by contradiction, assuming that the instance In* returned by Procedure I has a nonpreemptive solution of makespan strictly less than βC.
Note that for the latter to hold any job j in In* has to be assigned to the last machine i added by Procedure I for which pij < ∞ (this is obvious for jobs in I0, and follows inductively for jobs in In, n ≤ n*).

[Figure 3.3: Construction of instance In+1(β).]

This implies that the load of all machines i(n*; ℓ) (which were the last machines included) due to jobs different from j(n*) equals C/fn*−1. Indeed, each job j that was fractionally assigned to any of these machines had x_{i(n*;ℓ)j} = fn*−1, and i(n*; ℓ) was the last machine for which pij was bounded. Thus, as all machines i(n*; ℓ) had load C in the fractional assignment, they will have load C/fn*−1 in the nonpreemptive solution. Furthermore, job j(n*), for which p_{i(n*;ℓ)j(n*)} = C(β − 1/fn*−1), must be processed on some machine i(n*; ℓ̄). Thus, the load of machine i(n*; ℓ̄) is C/fn*−1 + C(β − 1/fn*−1) = βC, which is a contradiction.

To prove that the procedure in fact finishes, we first show a technical lemma.

Lemma 3.6. For each β ∈ [2, 4), if fn > 1/β, then fn+1 ≤ fn.

Proof. It follows from Equation (3.14) that,

\[
f_{n+1} - f_n = \frac{-\beta f_n^2 + \beta f_n - 1}{\beta f_n - 1}.
\]

Note that the numerator of the last expression is always negative, since the quadratic −βx² + βx − 1 takes the value −1 at x = 0 and has no real roots for 0 ≤ β < 4. The result follows since, by hypothesis, the denominator of this expression is positive.

Lemma 3.7. Procedure I finishes.

Proof. We need to show that for every β ∈ [2, 4), there exists n* ∈ N such that fn* ≤ 1/(β − 1). If this does not hold, then fn > 1/(β − 1) > 1/β for all n ∈ N. Then Lemma 3.6 implies that {fn}n∈N is a decreasing sequence.
Therefore fn must converge to some real number L ≥ 1/(β − 1). Thus, Equation (3.14) implies that

\[
L = \frac{(\beta - 1)L - 1}{\beta L - 1},
\]

and therefore L is a real root of −βx² + βx − 1 = 0, which is a contradiction since this quadratic has no real roots for β ∈ [2, 4).

We have proved the following theorem.

Theorem 3.8. For each β ∈ [2, 4) and ε > 0, there is an instance I of R||Cmax, for which the optimal preemptive makespan is at most C(1 + ε), and the optimal nonpreemptive makespan is at least βC.

Corollary 3.9. The integrality gap of [LL] is 4.

Chapter 4

Approximation algorithms for minimizing ∑ wL CL on unrelated machines

In this chapter we present approximation algorithms for the general case R|rij|∑ wL CL and its preemptive version R|rij, pmtn|∑ wL CL. Most of the techniques used for this are generalizations of the methods shown in the previous chapter.

4.1 A (4 + ε)-approximation algorithm for R|rij, pmtn|∑ wL CL

In the following we present a (4 + ε)-approximation algorithm for the preemptive version of R|rij|∑ wL CL. This means that for each ε > 0 we give a (4 + ε)-approximation algorithm, whose running time is polynomial in the size of the input and 1/ε. From now on we will assume without loss of generality that all processing times pij are integers greater than or equal to 1. If this is not the case we can discard the cases when pij = 0 as trivial and scale the remaining processing times.

The algorithm developed in this section is based on a time-indexed linear program, whose variables represent the fraction of each job that is processed at each (discrete) point in time on each machine. This kind of linear relaxation was originally introduced by Dyer and Wolsey [13] for the problem 1|rj|∑ wj Cj, and was later extended by Schulz and Skutella [35], who used it to obtain a (3/2 + ε)-approximation and a (2 + ε)-approximation for R||∑ wj Cj and R|rj|∑ wj Cj respectively.
Let us consider a time horizon T, large enough to upper bound the greatest completion time of any reasonable schedule, for instance T = max_{i∈M, k∈J} {rik + ∑_{j∈J} pij}. We divide the time horizon into exponentially growing time intervals, so that there are only polynomially many of them. For that, let ε > 0 be a fixed parameter, and let q be the first integer such that (1 + ε)^{q−1} ≥ T. Then, we consider the intervals

[0, 1], (1, (1 + ε)], ((1 + ε), (1 + ε)²], ..., ((1 + ε)^{q−2}, (1 + ε)^{q−1}].

To simplify the notation, let us define τ0 = 0, and τℓ = (1 + ε)^{ℓ−1}, for each ℓ = 1, ..., q. With this, the ℓ-th interval corresponds to (τℓ−1, τℓ].

Given any preemptive schedule, let yjiℓ be the fraction of job j that is processed on machine i in the ℓ-th interval. Then, pij yjiℓ is the amount of time that job j is processed on machine i in the ℓ-th interval. Consider the following linear program:

\begin{align}
\text{[DW]}\qquad \min\ & \sum_{L\in O} w_L C_L \notag\\
& \sum_{i\in M}\sum_{\ell=1}^{q} y_{ji\ell} = 1 && \text{for all } j\in J, \tag{4.1}\\
& \sum_{j\in J} p_{ij}y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} && \text{for all } \ell = 1,\dots,q \text{ and } i\in M, \tag{4.2}\\
& \sum_{i\in M} p_{ij}y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} && \text{for all } \ell = 1,\dots,q \text{ and } j\in J, \tag{4.3}\\
& \sum_{i\in M}\left( y_{ji1} + \sum_{\ell=2}^{q} \tau_{\ell-1}y_{ji\ell} \right) \le C_L && \text{for all } L\in O,\ j\in L, \tag{4.4}\\
& y_{ji\ell} = 0 && \text{for all } j, i, \ell : r_{ij} > \tau_\ell, \tag{4.5}\\
& y_{ji\ell} \ge 0 && \text{for all } i, j, \ell. \tag{4.6}
\end{align}

It is easy to see that this is a relaxation of our problem. Indeed, Equation (4.1) ensures that every job is completely processed. Equation (4.2) must hold since in each interval ℓ and machine i the total amount of time available is at most τℓ − τℓ−1. Similarly, Equation (4.3) holds since no job can be processed on two machines at the same time, and therefore for a fixed interval the total amount of time that can be used to process a job is at most the length of the interval. To see that Equation (4.4) is valid notice that pij ≥ 1, and thus CL ≥ 1 for all L ∈ O. Also notice that CL ≥ τℓ−1 for all L, j ∈ L, i, ℓ such that yjiℓ > 0. Thus, the left hand side of Equation (4.4) is a convex combination of values that are at most CL.
Finally, Equation (4.5) must hold since no part of a job can be assigned to an interval that finishes before its release date on the given machine.

As usual in approximation algorithms based on linear relaxations, we first compute the optimal solution of [DW], and then transform it into a preemptive schedule whose cost is within a constant factor of the optimal cost of [DW]. To construct the schedule we do as follows. For any job j ∈ L, we truncate to zero all its variables that assign part of it to an interval considerably later than CL. Afterwards, we use Algorithm: Nonparallel Assignment (see Section 3.1) to construct a feasible schedule inside each interval, making sure that no job is processed on two machines at the same time.

More precisely, let y*jiℓ and C*L be the optimal solution of [DW]. Let j ∈ J, and L = arg min{C*L′ : L′ ∈ O, L′ ∋ j}. For a given parameter β > 1 (which will be appropriately chosen later), we define:

\begin{equation}
y'_{ji\ell} = \begin{cases} 0 & \text{if } \tau_{\ell-1} > \beta C^*_L, \\[2pt] \dfrac{y^*_{ji\ell}}{Y_j} & \text{if } \tau_{\ell-1} \le \beta C^*_L, \end{cases} \tag{4.7}
\end{equation}

where,

\[
Y_j = \sum_{i\in M}\ \sum_{\ell:\ \tau_{\ell-1} \le \beta C^*_L} y^*_{ji\ell}.
\]

The modified solution y′ satisfies the following lemma.

Lemma 4.1. The modified solution y′jiℓ, obtained by applying Equation (4.7) to y*jiℓ, satisfies,

\begin{align}
& \sum_{i\in M}\sum_{\ell=1}^{q} y'_{ji\ell} = 1 && \text{for all } j, \tag{4.8}\\
& \sum_{j\in J} p_{ij}y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) && \text{for all } i, \ell, \tag{4.9}\\
& \sum_{i\in M} p_{ij}y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) && \text{for all } j, \ell, \tag{4.10}\\
& y'_{ji\ell} = 0 \ \text{ if } \tau_{\ell-1} > \beta C^*_L && \text{for all } L\in O,\ j\in L. \tag{4.11}
\end{align}

Proof. It is clear that y′jiℓ satisfies (4.8) since:

\[
\sum_{i\in M}\sum_{\ell=1}^{q} y'_{ji\ell} = \sum_{i\in M}\ \sum_{\ell:\ y'_{ji\ell} > 0} \frac{y^*_{ji\ell}}{Y_j} = \frac{1}{Y_j}\sum_{i\in M}\ \sum_{\ell:\ \tau_{\ell-1}\le\beta C^*_L} y^*_{ji\ell} = 1.
\]

Furthermore, to show that equations (4.9) and (4.10) hold, note that,

\begin{align*}
1 - Y_j &= \sum_{i\in M}\ \sum_{\ell:\ \tau_{\ell-1} > \beta C^*_L} y^*_{ji\ell}\\
&\le \sum_{i\in M}\ \sum_{\ell:\ \tau_{\ell-1} > \beta C^*_L} y^*_{ji\ell}\,\frac{\tau_{\ell-1}}{\beta C^*_L}\\
&\le \frac{C^*_L}{\beta C^*_L} = \frac{1}{\beta}.
\end{align*}

The last inequality follows from Equation (4.4), and by noting that ℓ ≥ 2 whenever τℓ−1 > β · C*L (since τ0 = 0). Then, Yj ≥ (β − 1)/β, and thus y′jiℓ ≤ (β/(β − 1)) y*jiℓ. With this, equations (4.9) and (4.10) follow from equations (4.2) and (4.3).
Finally, note that Equation (4.11) follows directly from the definition of y′.

Equation (4.11) in the previous lemma implies that, whenever j ∈ L, the variables y′jiℓ only assign job j to intervals that start no later than βC*L. On the other hand, as shown by equations (4.9) and (4.10), the amount of load assigned to each interval may not fit in the available time span τℓ − τℓ−1. Thus, we will have to increase the size of every interval by a factor β/(β − 1). With the latter observations, we are ready to describe the algorithm.

Algorithm: Greedy Preemptive LP

1. Solve [DW] to optimality and call the solution y* and (C*L)L∈O.

2. Define y′jiℓ using Equation (4.7).

3. Construct a preemptive schedule S as follows.

(a) For each ℓ = 1, ..., q, define xij = y′jiℓ and C = (τℓ − τℓ−1)β/(β − 1), and apply Algorithm: Nonparallel Assignment to this fractional solution. Call the preemptive schedule obtained Sℓ.

(b) For each job j ∈ J that is processed by schedule Sℓ at time t ∈ [0, C] on machine i ∈ M, make schedule S process j on machine i at time t + τℓ−1 β/(β − 1).

Lemma 4.2. Algorithm: Greedy Preemptive LP constructs a feasible schedule where the completion time of each order L ∈ O is at most C*L(1 + ε)β²/(β − 1).

Proof. Note that equations (4.9) and (4.10) imply that for each ℓ = 1, ..., q, xij = y′jiℓ and C = (τℓ − τℓ−1)β/(β − 1) satisfy equations (3.2) and (3.3). Then, by Lemma 3.1, the makespan of each schedule Sℓ is at most (τℓ − τℓ−1)β/(β − 1), and thus the schedule Sℓ defines the schedule S in the disjoint amplified interval [τℓ−1 β/(β − 1), τℓ β/(β − 1)). Also, it follows from Lemma 3.1 and Equation (4.8) that the schedule S completely processes every job.

To bound the completion times of the orders, consider a fixed order L ∈ O and job j ∈ L. Let ℓ* be the last interval for which y′jiℓ > 0, for some machine i ∈ M. I.e.,

\[
\ell^* = \max_{i\in M}\ \max\{\ell\in\{1,\dots,q\} \mid y'_{ji\ell} > 0\}.
\]

Then, the completion time Cj is at most τℓ* β/(β − 1).
To further bound $C_j$, we consider two cases. If $\ell^* = 1$ then

\[ C_j \le \frac{\beta}{\beta-1} \le C^*_L (1+\varepsilon) \frac{\beta}{\beta-1}, \]

where the last inequality follows since $C^*_L \ge 1$. On the other hand, if $\ell^* > 1$, Equation (4.11) implies that

\[ C_j \le \tau_{\ell^*} \frac{\beta}{\beta-1} \le \tau_{\ell^*-1} (1+\varepsilon) \frac{\beta}{\beta-1} \le C^*_L (1+\varepsilon) \frac{\beta^2}{\beta-1}. \]

Thus, by taking the maximum over all $j \in L$, the completion time of the order $L$ is upper bounded by $C^*_L (1+\varepsilon)\beta^2/(\beta-1)$.

Theorem 4.3. Algorithm: Greedy Preemptive LP is a $(4+\varepsilon)$-approximation for $\beta = 2$.

Proof. Let $C_L$ be the completion time of order $L$ given by Algorithm: Greedy Preemptive LP. Taking $\beta = 2$ in the last lemma, which is the choice that minimizes $\beta^2/(\beta-1)$ over $\beta > 1$, it follows that $C_L \le C^*_L (1+\varepsilon)\beta^2/(\beta-1) = 4(1+\varepsilon) C^*_L$. Then, multiplying $C_L$ by its weight $w_L$ and adding over all $L \in O$, we conclude that the cost of the schedule constructed is no larger than $4(1+\varepsilon)$ times the cost of the optimal solution to [DW], which is a lower bound on the cost of the optimal preemptive schedule.

4.2 A constant factor approximation for $R|r_{ij}|\sum w_L C_L$

We now give the first constant factor approximation for the general problem $R|r_{ij}|\sum w_L C_L$. Our algorithm is based on an interval-indexed linear programming relaxation developed by Hall, Schulz, Shmoys, and Wein [21], and on the rounding technique developed in Section 3.2.

As before, we consider a time horizon $T$, large enough to upper bound the greatest completion time of any reasonable schedule, for example $T = \max_{i \in M, k \in J}\{r_{ik} + \sum_{j \in J} p_{ij}\}$. We also divide the time horizon into exponentially growing time intervals, so that there are only polynomially many of them. For that, let $\alpha > 1$ be a parameter which will be determined later, and let $q$ be the first integer such that $\alpha^{q-1} \ge T$. With this, consider the intervals

\[ [1, 1],\ (1, \alpha],\ (\alpha, \alpha^2],\ \dots,\ (\alpha^{q-2}, \alpha^{q-1}]. \]

To simplify the notation, let us define $\tau_0 = 1$, and $\tau_\ell = \alpha^{\ell-1}$ for each $\ell = 1, \dots, q$. With this, the $\ell$-th interval corresponds to $(\tau_{\ell-1}, \tau_\ell]$.
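As a small illustration of the interval construction just described, the following Python sketch computes the breakpoints $\tau_0, \dots, \tau_q$ and locates the interval containing a given time. The function names `geometric_breakpoints` and `interval_of` are hypothetical and are not part of the original text; the values $\alpha = 2$ and $T = 10$ are arbitrary.

```python
def geometric_breakpoints(alpha, horizon):
    """Return [tau_0, tau_1, ..., tau_q] with tau_0 = 1 and tau_l = alpha**(l - 1).

    q is the first integer with alpha**(q - 1) >= horizon.
    """
    if alpha <= 1 or horizon < 1:
        raise ValueError("need alpha > 1 and horizon >= 1")
    taus = [1.0, 1.0]  # tau_0 = 1 and tau_1 = alpha**0 = 1
    while taus[-1] < horizon:
        taus.append(taus[-1] * alpha)
    return taus


def interval_of(time, taus):
    """Index l >= 1 of the interval (tau_{l-1}, tau_l] containing `time`.

    The first interval is the single point [1, 1], so time == 1 gives l = 1.
    """
    for l in range(1, len(taus)):
        if time <= taus[l]:
            return l
    raise ValueError("time exceeds the horizon")


taus = geometric_breakpoints(alpha=2, horizon=10)
q = len(taus) - 1  # number of intervals
```

Since the breakpoints grow geometrically, $q$ is logarithmic in $T$, which is what keeps the number of variables of the relaxation polynomial.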
Let us remark that, in this setting, the first interval starts and finishes at 1, contrary to the definition in the previous section where the first interval started at 0 and finished at 1.

To model the scheduling problem we consider the variables $y_{ji\ell}$, indicating whether job $j$ finishes in machine $i$ and in interval $\ell$. These variables allow us to write the following linear program based on that in [21], which is a relaxation of the scheduling problem even when integrality constraints are imposed.

\[ \text{[HSSW]} \qquad \min \sum_{L \in O} w_L C_L \]
\[ \sum_{i \in M} \sum_{\ell=1}^{q} y_{ji\ell} = 1 \quad \text{for all } j \in J, \tag{4.12} \]
\[ \sum_{s=1}^{\ell} \sum_{j \in J} p_{ij}\, y_{jis} \le \tau_\ell \quad \text{for all } i \in M \text{ and } \ell = 1, \dots, q, \tag{4.13} \]
\[ \sum_{i \in M} \sum_{\ell=1}^{q} \tau_{\ell-1}\, y_{ji\ell} \le C_L \quad \text{for all } L \in O,\ j \in L, \tag{4.14} \]
\[ y_{ji\ell} = 0 \quad \text{for all } i, \ell, j : p_{ij} + r_{ij} > \tau_\ell, \tag{4.15} \]
\[ y_{ji\ell} \ge 0 \quad \text{for all } i, \ell, j. \tag{4.16} \]

It is clear that [HSSW] is a relaxation of our problem. Indeed, for any nonpreemptive schedule, define $y_{ji\ell} = 1$ iff job $j$ finishes processing on machine $i$ in the $\ell$-th interval. Then, Equation (4.12) holds since each job finishes in exactly one interval and one machine. The left hand side of (4.13) corresponds to the total load processed in machine $i$ in the interval $[0, \tau_\ell]$, and therefore the inequality is valid. The double sum in inequality (4.14) equals exactly $\tau_{\ell-1}$, where $\ell$ is the interval where job $j$ finishes, so it is at most $C_j$, and therefore it is upper bounded by $C_L$ if $j \in L$. The rule (4.15) imposes that some variables must be set to zero before the LP is solved. This is valid since if $p_{ij} + r_{ij} > \tau_\ell$ then job $j$ will not be able to finish before $\tau_\ell$ in machine $i$, and therefore $y_{ji\ell}$ will be zero.

Let $(y^*_{ji\ell})_{ji\ell}$ and $(C^*_L)_L$ be an optimal solution to [HSSW]. To obtain a feasible schedule we need to round this solution into an integral one.
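The argument that every nonpreemptive schedule induces a feasible integral solution of [HSSW] can be checked on a toy instance. The sketch below is only an illustration under assumptions: the function `check_relaxation`, the dictionary-based encoding of a schedule, and the three-job instance with zero release dates are all hypothetical and not taken from the original text.

```python
def check_relaxation(schedule, proc, taus):
    """Map a nonpreemptive schedule to the interval in which each job finishes.

    schedule: job -> (machine, completion time)
    proc:     (machine, job) -> processing time p_ij
    taus:     breakpoints [tau_0, ..., tau_q]
    Returns job -> l with y_{j,i,l} = 1, after asserting (4.13) and the
    bound tau_{l-1} <= C_j used to justify (4.14).
    """
    q = len(taus) - 1
    finish = {}
    for job, (machine, completion) in schedule.items():
        finish[job] = next(l for l in range(1, q + 1) if completion <= taus[l])
        # tau_{l-1} never exceeds the completion time, as needed for (4.14)
        assert taus[finish[job] - 1] <= completion
    machines = {machine for machine, _ in schedule.values()}
    for i in machines:
        for l in range(1, q + 1):
            load = sum(proc[i, j] for j, (mi, _) in schedule.items()
                       if mi == i and finish[j] <= l)
            assert load <= taus[l]  # constraint (4.13)
    return finish


# One machine, jobs a, b, c processed back to back (alpha = 2).
taus = [1.0, 1.0, 2.0, 4.0, 8.0]
proc = {(0, "a"): 1, (0, "b"): 2, (0, "c"): 3}
schedule = {"a": (0, 1), "b": (0, 3), "c": (0, 6)}
finish = check_relaxation(schedule, proc, taus)
```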
For the special case where all orders are singletons (as in the setting of Hall et al. [21]), (4.14) becomes an equality, so that one can directly use Theorem 3.4, regarding each machine-interval pair of our problem as one machine in the algorithm, to round a fractional solution to an integral solution of smaller total cost. When doing this, the right hand side of equation (4.13) is increased to $\tau_\ell + \max\{p_{ij} : y_{ji\ell} > 0\} \le 2\tau_\ell$, where the last inequality follows from (4.15). This can be used to derive a constant factor approximation algorithm for the problem. In our setting, however, it is not possible to apply the theorem directly, due to the nonlinearity of the objective function. We thus take a detour in the same manner as in Section 4.1: we round down to zero all variables $y^*_{ji\ell}$ for which $\tau_{\ell-1}$ is larger than a certain parameter $\beta$ times $C^*_L$, for $L = \arg\min\{C^*_{L'} : L' \ni j\}$ (we will optimize over $\beta$ later on). For that we define the variables $y'_{ji\ell}$ using Equation (4.7). With this, the next lemma follows from calculations similar to those of Lemma 4.1.

Lemma 4.4. The modified solution $y'_{ji\ell} \ge 0$ satisfies:

\[ \sum_{i \in M} \sum_{\ell=1}^{q} y'_{ji\ell} = 1 \quad \text{for all } j \in J, \tag{4.17} \]
\[ \sum_{s=1}^{\ell} \sum_{j \in J} p_{ij}\, y'_{jis} \le \frac{\beta}{\beta-1}\, \tau_\ell \quad \text{for all } i \in M \text{ and } \ell = 1, \dots, q, \tag{4.18} \]
\[ y'_{ji\ell} = 0 \quad \text{if } p_{ij} + r_{ij} > \tau_\ell \text{ or } \tau_{\ell-1} > \beta C^*_L, \quad \forall i, j, \ell, L : j \in L. \tag{4.19} \]

With the previous lemma at hand we are in a position to apply Theorem 3.4. To do this we regard each interval-machine pair of our problem as one machine of the algorithm. In other words, defining the set of machines $M' = M \times \{1, \dots, q\}$ and $x_{hj} = y'_{j h_1 h_2}$ for each $h = (h_1, h_2) \in M'$, we round $x_{hj}$ to an integral solution $\hat{x}_{hj} =: \hat{y}_{j h_1 h_2} \in \{0, 1\}$ satisfying equations (4.17), (4.19) and

\[ \sum_{j \in J} \hat{y}_{ji\ell}\, p_{ij} \le \sum_{j \in J} y'_{ji\ell}\, p_{ij} + \max\{p_{ij} : y'_{ji\ell} > 0,\ j \in J\} \le \sum_{j \in J} y'_{ji\ell}\, p_{ij} + \tau_\ell, \tag{4.20} \]

where the first inequality follows from (3.11) and the second from (4.19).

We are now ready to give the algorithm for $R|r_{ij}|\sum w_L C_L$.
Algorithm: Greedy-LP

(1) Solve [HSSW], obtaining an optimal solution $(y^*_{ji\ell})$ and $(C^*_L)_L$.
(2) Modify the solution according to (4.7) to obtain $(y'_{ji\ell})$ satisfying (4.17), (4.18), and (4.19).
(3) Round $(y'_{ji\ell})$ using Theorem 3.4 to obtain an integral solution $(\hat{y}_{ji\ell})$ as above.
(4) Let $J_{i\ell} = \{j \in J : \hat{y}_{ji\ell} = 1\}$. Greedily schedule in each machine $i$ all jobs in $\bigcup_{\ell=1}^{q} J_{i\ell}$, starting from those in $J_{i1}$ until we reach $J_{iq}$ (with an arbitrary order inside each set $J_{i\ell}$), respecting the release dates.

To break down the analysis, let us first show that Greedy-LP is a constant factor approximation for the case in which all release dates are zero.

Theorem 4.5. Algorithm Greedy-LP is a $(27/2)$-approximation for $R||\sum w_L C_L$.

Proof. Let us fix a machine $i$ and take a job $j \in L$ such that $\hat{y}_{ji\ell} = 1$, so that $j \in J_{i\ell}$. Clearly $C_j$, the completion time of job $j$ in algorithm Greedy-LP, is at most the total processing time of the jobs in $\bigcup_{k=1}^{\ell} J_{ik}$. Then,

\[
\begin{aligned}
C_j &\le \sum_{s=1}^{\ell} \sum_{k \in J} p_{ik}\, \hat{y}_{kis} \\
&\le \sum_{s=1}^{\ell} \Big( \sum_{k \in J} p_{ik}\, y'_{kis} + \tau_s \Big) \\
&\le \frac{\beta}{\beta-1}\, \tau_\ell + \sum_{s=1}^{\ell} \tau_s \\
&\le \Big( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \Big) \tau_{\ell-1} \\
&\le \beta\alpha \Big( \frac{\beta}{\beta-1} + \frac{\alpha}{\alpha-1} \Big) C^*_L.
\end{aligned}
\]

The second inequality follows from (4.20), the third from (4.18), and the fourth follows from the definition of $\tau_k$. The last inequality follows since, by condition (3.11), $\hat{y}_{ji\ell} = 1$ implies $y'_{ji\ell} > 0$, so that by (4.19) we have $\tau_{\ell-1} \le \beta C^*_L$. Optimizing the approximation factor over $\alpha$ and $\beta$, the best possible factor given by this method is attained at $\alpha = \beta = 3/2$, and therefore we conclude that $C_j \le 27/2 \cdot C^*_L$. As this holds for all $j \in L$, we conclude that $C_L = \max_{j \in L} C_j \le 27/2 \cdot C^*_L$. The claimed approximation factor follows directly by multiplying this inequality by $w_L$, adding over all $L \in O$, and using the fact that the optimal value of [HSSW] is a lower bound on the cost of the optimal schedule.

Theorem 4.6. Algorithm Greedy-LP is a $(27/2)$-approximation for $R|r_{ij}|\sum w_L C_L$.

Proof.
Similarly to the proof of the previous theorem, we will show that the solution given by Algorithm Greedy-LP satisfies $C_j \le 27/2 \cdot C^*_L$, even in the presence of release dates. Let us define

\[ \bar{\tau}_\ell := \frac{1}{\alpha-1} + \sum_{s=1}^{\ell} \Big( \sum_{k \in J} p_{ik}\, y'_{kis} + \tau_s \Big). \]

We will see that it is possible to schedule every set of jobs $J_{i\ell}$ in machine $i$ between time $\bar{\tau}_{\ell-1}$ and time $\bar{\tau}_\ell$ (with an arbitrary order inside each interval), respecting all release dates. Indeed, assuming that $1 < \alpha \le 2$, it follows from (3.11) and (4.19) that for every $j \in J_{i\ell}$,

\[ r_{ij} \le \tau_\ell \le \frac{\tau_\ell - 1}{\alpha-1} + \frac{1}{\alpha-1} = \frac{1}{\alpha-1} + \sum_{k=1}^{\ell-1} \tau_k \le \bar{\tau}_{\ell-1}. \]

Thus job $j$ is available for processing in machine $i$ at time $\bar{\tau}_{\ell-1}$. On the other hand, note that $\bar{\tau}_\ell - \bar{\tau}_{\ell-1} = \sum_{j \in J} p_{ij}\, y'_{ji\ell} + \tau_\ell$, so it follows from (4.20) that all jobs in $J_{i\ell}$ fit inside $(\bar{\tau}_{\ell-1}, \bar{\tau}_\ell]$. We conclude that in the schedule constructed by Greedy-LP any job $j \in J_{i\ell}$ is processed before $\bar{\tau}_\ell$. Therefore, as in the previous theorem,

\[ C_j \le \frac{1}{\alpha-1} + \sum_{s=1}^{\ell} \Big( \sum_{k \in J} p_{ik}\, y'_{kis} + \tau_s \Big) \le \Big( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \Big) \tau_{\ell-1} \le \beta\alpha \Big( \frac{\beta}{\beta-1} + \frac{\alpha}{\alpha-1} \Big) C^*_L. \]

Again, choosing $\alpha = \beta = 3/2$ we obtain $C_j \le 27/2 \cdot C^*_L$.

Chapter 5

A PTAS for minimizing $\sum w_L C_L$ on parallel machines

In this chapter we design a PTAS for $P|part|\sum w_L C_L$ under some additional constraints. We assume that there is either a constant number of machines, a constant number of jobs per order, or a constant number of orders. First, we describe the case where the number of jobs of each order is bounded by a constant $K$, and then we justify that this implies the existence of PTASs for the other cases. The results in this chapter closely follow the PTAS developed for $P|r_j|\sum w_j C_j$ in [1].
However, our setting is technically more involved, mainly for three reasons. Firstly, it is crucial to show that there is a near-optimal schedule such that the time-span of every order is small, and, furthermore, the precise localization of orders is significantly more complicated. Also, as we shall see later, it is important that all the near-optimal solutions that we construct satisfy the properties of Lemma 5.1. Finally, we need to be slightly more careful in the final placing of jobs.

As usual in the design of approximation schemes, the general idea is to add structure to the solution by modifying the instance in such a way that the cost of the optimal solution is not worsened by more than a $(1+\varepsilon)$ factor. Also, by applying several modifications to the optimal solution of this new instance, we will prove that there exists a near-optimal solution that satisfies several extra properties. The structure given by these properties allows us to find this solution by enumeration or dynamic programming. As each one of the modifications that we are going to apply to the optimal solution only generates a loss of at most a factor of $(1+\varepsilon)$ in the cost, and we will apply only a constant number of them, we will end up with a solution whose cost is within a factor of $(1+\varepsilon)^{O(1)}$ of the cost of the optimal schedule. Then, choosing a small enough $\varepsilon$, we can approximate the optimal solution up to any factor.

To simplify the notation, in what follows we assume that all processing times are positive integers and that $1/\varepsilon$ is also an integer. Also, in what follows all logarithms are taken base $(1+\varepsilon)$, unless explicitly stated otherwise. Besides, we denote by $p(L) = \sum_{j \in L} p_j$ the total processing time of a set $L \subseteq J$.

As in the previous chapter, we partition the time horizon into exponentially increasing intervals. For every integer $t$ we denote by $I_t$ the interval $[(1+\varepsilon)^t, (1+\varepsilon)^{t+1})$, and we denote the size of the interval by $|I_t|$, so that $|I_t| = \varepsilon(1+\varepsilon)^t$.
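The interval notation above translates directly into code. The following Python sketch, with hypothetical helper names that do not appear in the original text, computes the endpoints and length of $I_t$ and the index $t$ of the interval containing a given time, which is the floor of its base-$(1+\varepsilon)$ logarithm.

```python
import math


def interval_bounds(t, eps):
    """Endpoints of I_t = [(1 + eps)**t, (1 + eps)**(t + 1))."""
    return (1 + eps) ** t, (1 + eps) ** (t + 1)


def interval_length(t, eps):
    """|I_t| = eps * (1 + eps)**t."""
    return eps * (1 + eps) ** t


def interval_index(x, eps):
    """Index t with x in I_t, for x > 0."""
    t = math.floor(math.log(x) / math.log1p(eps))
    # Guard against floating-point error when x sits on an interval boundary.
    while (1 + eps) ** (t + 1) <= x:
        t += 1
    while (1 + eps) ** t > x:
        t -= 1
    return t
```

For instance, with $\varepsilon = 1/2$ the intervals are $I_0 = [1, 1.5)$, $I_1 = [1.5, 2.25)$, $I_2 = [2.25, 3.375)$, and so on, so that time 2 lies in $I_1$.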
Besides rounding and partitioning, a basic procedure we will use repeatedly is that of stretching, which consists in stretching the time axis by a factor of $(1+\varepsilon)$. Of course, this only worsens the solution by a factor of $(1+\varepsilon)$. The two basic stretching procedures we use are:

1. Stretch Completion Times: This procedure consists in delaying all jobs, so that the completion time of a job $j$ becomes $C'_j = (1+\varepsilon)C_j$ in the new schedule. This increases the cost of the solution by exactly a factor of $(1+\varepsilon)$. This procedure creates a gap of idle time of size $\varepsilon p_j$ before each job $j$. Indeed, if $k$ was the job being processed just before $j$, then $C'_j - C'_k = C_j - C_k + \varepsilon(C_j - C_k) \ge C_j - C_k + \varepsilon p_j$.

2. Stretch Intervals: The objective of this procedure is to create idle time in every interval, except for those having a job that completely crosses them. As before, it consists in shifting jobs to the following interval. More precisely, if job $j$ finishes in $I_t$ and occupies $d_j$ time units in $I_t$, we move $j$ to $I_{t+1}$ by pushing it exactly $|I_t|$ time units, so it also uses exactly $d_j$ time units in $I_{t+1}$. Then, the completion time of $j$ in the new schedule is at most $(1+\varepsilon)C_j$, and therefore the overall cost is increased by at most a factor $(1+\varepsilon)$. Note that, if $j$ started processing in $I_t$ and was being processed in $I_t$ for $d_j$ time units, after the shift it will be processed in $I_{t+1}$ for at most $d_j$ time units. Since $I_{t+1}$ has $\varepsilon|I_t| = \varepsilon^2(1+\varepsilon)^t$ more time units than $I_t$, at least that much idle time will be created in $I_{t+1}$. Also, we can assume that this idle time is consecutive in each interval. Indeed, this can be accomplished by moving to the left as much as possible all jobs that are scheduled completely inside an interval.

After applying the procedures we will also shift the index of the intervals to the right, so that if a job was being processed in interval $I_t$, it will still be processed in $I_t$ in the new schedule.
This will give the illusion that we have stretched time or intervals by a $(1+\varepsilon)$ factor.

Before giving a general description of the algorithm we show that there exists a $(1+\varepsilon)$-approximate schedule where no order crosses more than $O(1)$ intervals. For this, we first show the following basic property, which is stated for the more general case of unrelated machines.

Lemma 5.1. For any instance of $R|part|\sum w_L C_L$ there exists an optimal schedule such that:

1. For any order $L \in O$ and for any machine $i = 1, \dots, m$, all jobs in $L$ assigned to $i$ are processed consecutively.
2. The sequence in which the orders are arranged inside each machine is independent of the machine.

Proof. Let us consider an optimal schedule of the problem. For a given order-machine pair $L$ and $i$, let $j^*$ be the last job in $L$ that is assigned to $i$. It is easy to see that any job in $L$ that is processed in $i$ before $j^*$ can be processed just before $j^*$ without increasing the completion time of any order. With this we conclude the first property of the lemma. For the rest of the proof we will assume that the optimal solution satisfies this property.

For the second property, note that inside each machine the orders can be arranged following their completion times $C_{L_1} \le C_{L_2} \le \dots \le C_{L_k}$ without increasing the cost of the solution. If this does not hold, then there exist two orders $L, L'$ such that $C_L \le C_{L'}$ and in some machine $i$ the jobs of $L'$ are processed just before the ones in $L$. If this is the case then it is clear that interchanging these two sets of jobs in machine $i$ does not increase the cost of the solution, since the jobs in $L$ will decrease their completion time while the jobs in $L'$ will still complete before $C_L$. Therefore, due to the fact that $C_L \le C_{L'}$, the completion time of $L'$ in this new schedule will remain the same. The procedure can be iterated until the second property in the lemma is satisfied.

Lemma 5.2.
Let $s := \lceil \log(1 + 1/\varepsilon) \rceil$. Then there exists a $(1+\varepsilon)$-approximate schedule in which every order is fully processed in $s + 1$ consecutive intervals.

Proof. Let us consider an optimal schedule as in Lemma 5.1 and apply Stretch Completion Times. Then we move all jobs to the right as much as possible without increasing the completion time of any order. Note that for any order $L$, each job $j \in L$ increased its completion time by at least $\varepsilon C_L$. Indeed, if this is not the case, let $L$ be the last order (in terms of completion time) for which there exists $j \in L$ that increased its completion time by less than $\varepsilon C_L$. Let $i$ be the machine processing $j$. Lemma 5.1 implies that all jobs processed in $i$ after job $j$ belong to orders that finish later than $C_L$, and thus they increase their completion time by at least $\varepsilon C_L$. As the completion time of order $L$ was also increased by $\varepsilon C_L$, we conclude that job $j$ could be moved to the right by $\varepsilon C_L$, contradicting the assumption.

This implies that after moving jobs to the right, the starting point $S_L$ of each order $L \in O$ will be at least $\varepsilon C_L$, and therefore $C_L - S_L \le S_L/\varepsilon$. Let $I_x$ and $I_y$ be the intervals where $L$ starts and finishes, respectively. Then

\[ (1+\varepsilon)^y - (1+\varepsilon)^{x+1} \le C_L - S_L \le S_L/\varepsilon \le (1/\varepsilon)(1+\varepsilon)^{x+1}, \]

which implies that $y - x - 1 \le \log(1 + \tfrac{1}{\varepsilon}) \le s$.

5.1 Algorithm overview

In the following we describe the general idea of the PTAS. Let us divide the time horizon into blocks of $s + 1 = \lceil \log(1 + 1/\varepsilon) \rceil + 1$ intervals, and denote by $B_\ell$ the block $[(1+\varepsilon)^{\ell(s+1)}, (1+\varepsilon)^{(\ell+1)(s+1)})$. Lemma 5.2 suggests optimizing over each block separately, and later putting the pieces together to construct a global solution.

Since there may be orders that cross from one block to the next, it will be necessary to perturb the "shape" of blocks. For that we introduce the concept of frontier. The outgoing frontier of block $B_\ell$ is a vector that has $m$ entries.
Its $i$-th coordinate is a guess on the completion time of the last job scheduled in machine $i$ among jobs that belong to orders that began processing in $B_\ell$ (in Section 5.4 we will see that there is a concise description of the frontiers). On the other hand, the incoming frontier of a block is the outgoing frontier of the previous one. For a given block and incoming and outgoing frontiers, we say that an order is scheduled inside block $B_\ell$ if in each machine all jobs in that order begin processing after the incoming frontier, and finish processing before the outgoing frontier.

Assume that we know how to compute a near-optimal solution for a given subset of orders $V \subseteq O$ inside a block $B_\ell$, with fixed incoming and outgoing frontiers $F'$ and $F$, respectively. Let $W(\ell, F', F, V)$ be the cost (sum of weighted completion times) of this solution. Let $\mathcal{F}_\ell$ be the set of possible outgoing frontiers of block $B_\ell$. Using dynamic programming, we can fill a table $T(\ell, F, U)$ containing the cost of a near-optimal schedule for the subset of orders $U \subseteq O$ in block $B_\ell$ or before, respecting the outgoing frontier $F$ of $B_\ell$. To compute this quantity, we can use the recursive formula:

\[ T(\ell+1, F, U) = \min_{F' \in \mathcal{F}_\ell,\ V \subseteq U} \{ T(\ell, F', V) + W(\ell+1, F', F, U \setminus V) \}. \]

Unfortunately, the table $T$ is not of polynomial size, or even finite. Thus, it will be necessary to reduce its size as done in [1]. Summarizing, the outline of the algorithm is the following.

Algorithm: PTAS-DP

1. Localization: In this step we bound the time-span of the intervals in which each order may be processed. We give extra structure to the instance and define a release date $r_L$ for each order $L$, such that there exists a near-optimal solution where each order begins processing after $r_L$ and ends processing no later than a constant number of intervals after $r_L$. More precisely, we prove that each order $L$ is scheduled in the interval $[r_L, r_L \cdot (1+\varepsilon)^{g(\varepsilon,K)}]$, for some function $g$ that will be specified later.
This plays a crucial role in the next step.

2. Polynomial Representation of Order's Subsets: The goal of this step is to reduce the number of subsets of orders that need to be tried in the dynamic programming. To do this, for all $\ell$, we find a polynomial size set $\Theta_\ell \subseteq 2^O$ of possible subsets of orders that are processed in $B_\ell$ or before in some near-optimal schedule.

3. Polynomial Representation of Frontiers: In this step we reduce the number of frontiers we need to try in the dynamic programming. For all $\ell$, we find a set $\hat{\mathcal{F}}_\ell \subset \mathcal{F}_\ell$ of polynomial size such that for each block the outgoing frontier of a near-optimal schedule belongs to $\hat{\mathcal{F}}_\ell$.

4. Dynamic Programming: For all $\ell$, $F \in \hat{\mathcal{F}}_{\ell+1}$, $U \in \Theta_\ell$ compute:

\[ T(\ell, F, U) = \min_{F' \in \hat{\mathcal{F}}_\ell,\ V \subseteq U,\ V \in \Theta_{\ell-1}} \{ T(\ell-1, F', V) + W(\ell, F', F, U \setminus V) \}. \]

It is clear that it is not necessary to compute $W(\ell, F', F, U \setminus V)$ exactly; a $(1+\varepsilon)$-approximation of this value, that moves the frontiers by at most a factor of $(1+\varepsilon)$, is enough. To compute this approximation we partition jobs into small and large. For large jobs we use enumeration and essentially try all possible schedules, while for small jobs we greedily schedule them using Smith's rule.

One of the main difficulties of this approach is that all the modifications applied to the optimal solution must preserve the properties given by Lemma 5.1. This is necessary to be able to describe the interaction between one block and the following one by using only the frontier. In other words, if this is not true, it could happen that some jobs of an order that begins processing in a block $B_\ell$ are processed after a job of an order that begins processing in block $B_{\ell+1}$. This would greatly increase the complexity of the algorithm, since this interaction would need to be considered in the dynamic programming, which would become too large. This is also the main reason why our result does not directly generalize to the case with release dates, since then Lemma 5.1 does not hold.
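The recursion over blocks can be sketched as follows. This is only an illustrative Python sketch under assumptions: the candidate frontier sets and order subsets are taken as given inputs, the cost oracle `block_cost` stands in for $W$, blocks are indexed from 0, and the toy instance (two orders, one order per block, trivial frontiers) is entirely hypothetical.

```python
import math


def block_dp(num_blocks, frontiers, subsets, block_cost, start_frontier=None):
    """Fill the table T(l, F, U) block by block.

    frontiers[l]: candidate outgoing frontiers of block l (hashable values)
    subsets[l]:   candidate sets U (frozensets) of orders done in blocks 0..l
    block_cost(l, f_in, f_out, orders): stand-in for W, math.inf if infeasible
    """
    table = {(start_frontier, frozenset()): 0.0}  # nothing scheduled before block 0
    for l in range(num_blocks):
        new_table = {}
        for f_out in frontiers[l]:
            for u in subsets[l]:
                best = math.inf
                for (f_in, v), cost in table.items():
                    if v <= u:  # V must be a subset of U
                        best = min(best, cost + block_cost(l, f_in, f_out, u - v))
                if best < math.inf:
                    new_table[(f_out, u)] = best
        table = new_table
    return table


# Toy instance (hypothetical): each block fits one order, finishing at time l + 1.
weights = {"a": 3, "b": 1}
all_sets = [frozenset(s) for s in ([], ["a"], ["b"], ["a", "b"])]


def toy_cost(l, f_in, f_out, orders):
    if len(orders) > 1:
        return math.inf
    return sum(weights[o] * (l + 1) for o in orders)


final = block_dp(2, [[None], [None]], [all_sets, all_sets], toy_cost)
best_cost = final[(None, frozenset(weights))]
```

The running time of such a table-filling procedure is polynomial only if the lists of candidate frontiers and subsets are of polynomial size, which is exactly what Steps 2 and 3 of the outline are meant to guarantee.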
In the sequel we will analyze each of the previous steps separately.

5.2 Localization

Lemma 5.2 shows that each order is completely processed in at most a constant number, $s$, of consecutive intervals. However, we do not know a priori when in time each order is processed. In what follows, we refine this result by explicitly finding a constant number of consecutive intervals in which each order is processed in a near-optimal schedule. This property will be helpful in Step 2 to guess the specific block in which each order will be processed.

The localization will be done by introducing artificial release dates, i.e., for each order $L$ we will give a point in time $r_L$ such that, losing a factor of at most $(1+\varepsilon)$ in the cost, $L$ starts processing after $r_L$. Naturally, it is enough to consider release dates which are powers of $(1+\varepsilon)$. The release dates are chosen so that the total amount of processing time released at any point $(1+\varepsilon)^t$ is $(1+\varepsilon)^t O(m)$. This will be sufficient to show that in a $(1+\varepsilon)$-approximate schedule all orders finish processing within a constant number of intervals after they are released. The following definition will be useful in the description of the algorithm.

Definition 5.3. A job $j$ is said to be small with respect to a time instant $T$ if $p_j \le \varepsilon^3 T$. Otherwise, we say that $j$ is big. Also, an order $L$ is said to be small with respect to a time instant $T$ if $p(L) \le \varepsilon^2 T$. Otherwise, we say that $L$ is big.

Algorithm: Localization

1. Initialize $r_j := (1+\varepsilon)^{\lfloor \log \varepsilon p_j \rfloor}$, $u := \log(\min_{j \in J} r_j)$, $v := \lceil \log(\sum_{j \in J} p_j) \rceil$, and for all $L \in O$, $r_L := \max_{j \in L} r_j (1+\varepsilon)^{-s}$. Also let $P := \lceil \log(\max_{j \in J} p_j) \rceil$.

2. (i) For all orders $L \in O$ sort the jobs in $L$ in nonincreasing order of their size. Then greedily assign jobs to groups until the total processing time of each group just surpasses $\varepsilon^2 r_L$. After this process, there may be at most one group of size smaller than $\varepsilon^2 r_L$.
(ii) If this smaller group has total processing time smaller than $\varepsilon^3 r_L$ we add it to the biggest group, and otherwise we leave it as a group. After this process, we redefine the jobs in $L$ as the newly created groups, and define the release dates of the new jobs, $r_j := (1+\varepsilon)^{\lfloor \log \varepsilon p_j \rfloor}$.

3. For all $j \in J$ round its processing time up to the next power of $1+\varepsilon$, i.e., $p_j := (1+\varepsilon)^{\lceil \log p_j \rceil}$. Recall that $K = O(1)$ is the maximum number of jobs per order. For all $L \in O$ define its order type $T(L) \in \{0, \dots, K\}^P$ as a vector whose $p$-th component is the number of jobs in $L$ with processing time equal to $(1+\varepsilon)^p$, i.e., $T(L)_p := |\{j \in L : \log p_j = p\}|$.

4. For all $t = u, \dots, v$,
   (i) Define the set $O_t := \{L \in O : L \text{ is big with respect to } (1+\varepsilon)^t \text{ and } r_L = (1+\varepsilon)^t\}$.
   (ii) For $\alpha \in \{0, \dots, K\}^P$ let $O_t^\alpha$ be the set that contains the $K(1+\varepsilon)^{s+2} m/\varepsilon^5$ orders of largest weight in $\{L \in O_t : T(L) = \alpha\}$.
   (iii) Define $Q_t := \bigcup_{\alpha \in \{0, \dots, K\}^P} O_t^\alpha$.
   (iv) For all $L \in O_t \setminus Q_t$, redefine $r_L := (1+\varepsilon)^{t+1}$.

5. For all $t = u, \dots, v$,
   (i) Define $S_t := \{L \in O : L \text{ is small with respect to } (1+\varepsilon)^t \text{ and } r_L = (1+\varepsilon)^t\}$ and sort all orders $L \in S_t$ in nonincreasing order of $w_L/p(L)$.
   (ii) Define $R_t$ as the set that contains the first orders in $S_t$ such that their total processing time is in $[m\varepsilon(1+\varepsilon)^t,\ m\varepsilon(1+\varepsilon)^t + m\varepsilon^3(1+\varepsilon)^t]$.
   (iii) For all $L \in S_t \setminus R_t$, redefine $r_L := (1+\varepsilon)^{t+1}$.

In Step (1) we begin by defining a release date $r_j$ for every job, i.e., a point in time after which the job starts processing in a $(1+\varepsilon)$-approximate schedule. It is easy to see that this is valid, since applying Stretch Completion Times ensures that no job starts processing before $\varepsilon p_j$. Afterwards, we define the values $u$ and $v$, which give lower and upper bounds for the index of the time intervals in which jobs may be processed. In other words, we can restrict ourselves to intervals $I_t$ with $t \in \{u, u+1, \dots, v\}$. Finally, for every order $L \in O$ we initialize a release date $r_L := \max_{j \in L} r_j (1+\varepsilon)^{-s}$.
This is valid because for every order $L$ at least one of its jobs begins processing after $\max_{j \in L} r_j$, and Lemma 5.2 ensures that no order crosses more than $s$ intervals.

Clearly, this initial definition of the release dates is not enough to ensure that at each time instant $(1+\varepsilon)^t$ there will be no more than $(1+\varepsilon)^t O(m)$ total processing time released. To amend this we delay the release dates of orders that will not be able to start before the next integer power of $(1+\varepsilon)$. For that we first classify the orders by the size of their jobs. Note that between two orders that are indistinguishable except for their weight, i.e., if there is a one to one correspondence between the sizes of their jobs, the jobs of the one with larger weight will always have priority over the jobs of the other order. Therefore, among a set of orders that are indistinguishable except for their weight (we say that these orders are of the same type) we can greedily process the ones with larger weight first. This is the key argument to justify the delaying of release dates. Nevertheless, to successfully apply this we need to bound the number of types of orders. To do this we proceed in two steps. First we get rid of jobs that are too small. Then, we round the processing time of every job to bound the number of values a processing time can take.

Step (2) gets rid of every small job with respect to the release date of its order. This is done by considering several small jobs as one bigger job. The procedure is justified by the following lemma.

Lemma 5.4. There is a $(1+O(\varepsilon))$-approximate solution to the scheduling problem in which all jobs in each group defined in Step (2) of Algorithm: Localization are processed together in the same machine.

Proof. Let us first consider the groups of jobs defined in Step (2.i), and let us consider a $(1+O(\varepsilon))$-approximate schedule of the original instance.
We will find another $(1+O(\varepsilon))$-approximate schedule in which all jobs inside each of the groups are processed together. Then, losing at most a factor of $1+O(\varepsilon)$ in the cost, every group can be considered as a single larger job.

Notice that the groups consisting of only one job need not be considered in the proof. The rest of the groups consist only of jobs smaller than $\varepsilon^2 r_L$, and therefore by construction their total processing time will be smaller than $2\varepsilon^2 r_L$. Also, since all of these jobs have $p_j \le \varepsilon^2 r_L$, we can consider a near-optimal schedule where none of them is processed in more than one interval. Indeed, using Stretch Intervals we create enough space to schedule all these crossing jobs completely inside the interval where they begin processing.

Let us thus fix a group, and consider the machines and intervals in which the jobs that belong to it are being processed. Interpreting a machine-interval pair as a virtual machine, the group can be interpreted as a virtual job that is fractionally assigned to the virtual machines containing its jobs. Now we can apply Shmoys and Tardos' theorem (Theorem 3.4) to round this fractional solution so that each virtual job is processed completely inside a virtual machine. The rounding guarantees that the total processing time assigned to each virtual machine is not increased by more than $2\varepsilon^2 r_L$, since this is the largest a virtual job can be. By applying Stretch Intervals twice we create the extra space needed. Also, the completion time of the virtual job is increased by at most a factor $1+\varepsilon$, since the rounding only assigns a virtual job to an interval if some of its jobs were previously assigned to it. Therefore the completion time of no order is further increased. We conclude that we can consider each of these groups as one larger job.

To finish the proof we must show that if the smallest job of an order has total processing time smaller than $\varepsilon^3 r_L$ we can merge it with the biggest job.
Indeed, if $L$ was left with more than one job after merging jobs into groups, the biggest job has processing time at least $\varepsilon^2 r_L$. By applying Stretch Completion Times we create a gap of idle time of at least $\varepsilon^3 r_L$ before it, leaving enough space to fit the smallest job in there.

Note that after this step we can guarantee that no big order $L$ contains a small job with respect to $r_L$.

In Step (3) we first reduce the number of possible values a processing time can take by rounding them to powers of $1+\varepsilon$. It is easy to see that this does not increase the cost of the solution by more than a factor $1+\varepsilon$. Indeed, applying Stretch Completion Times leaves enough space so that we can increase the size of every job $j$ up to $(1+\varepsilon)p_j$, and $(1+\varepsilon)^{\lceil \log_{1+\varepsilon} p_j \rceil} \in [p_j, (1+\varepsilon)p_j]$. In the second part of Step (3) we classify orders by stating how many jobs of each size they contain. Since we are assuming that every processing time is at least one and is a power of $(1+\varepsilon)$, there are only $P := \lceil \log(\max_{j \in J} p_j) \rceil$ possible values a processing time can take.

In Step (4) we delay the release dates of big orders that do not have any chance of being processed at their current release date. For that, we let $O_t$ be the set of big orders released at $(1+\varepsilon)^t$. We further partition $O_t$ by the type of the orders. As explained before, for each order type, the orders with largest weight will be processed first, and therefore we can delay the release date of the orders with smallest weight that do not fit in $I_t$. In the following we will show that for any type of big order $\alpha$ and for any $t$, at most $K(1+\varepsilon)^{s+2} m/\varepsilon^5 = O(m)$ orders that belong to $O_t^\alpha$ can be processed in $I_t$, and therefore the rest can be delayed. The next lemma helps us to accomplish this.

Lemma 5.5. After delaying the release date of an order at most $\lceil \log(K/\varepsilon^3) + s + 1 \rceil$ times, the order becomes small with respect to its new release date.

Proof. Let $r_L$ be the release date of an order $L$ as it was initialized in Step (1).
Note that the definition of $r_L$ and $r_j$ implies that $p_j \le r_L (1+\varepsilon)^{s+1}/\varepsilon$, and since there are only $K$ jobs per order, $p(L) \le r_L K (1+\varepsilon)^{s+1}/\varepsilon$. If the release date has been delayed $r \ge \lceil \log(K/\varepsilon^3) + s + 1 \rceil$ times, then

\[ p(L) \le (1+\varepsilon)^{\log(r_L) + r}\, K (1+\varepsilon)^{s-r+1}/\varepsilon \le \varepsilon^2 (1+\varepsilon)^{\log r_L + r}, \]

and therefore $L$ is a small order with respect to its new release date.

Recall that for the original release dates, every job belonging to a big order satisfies $p_j \ge \varepsilon^3 r_L$. Therefore the last lemma implies that at any point in the algorithm, if $L$ is a big order with respect to its current release date $r_L$, then any job $j$ belonging to $L$ satisfies $p_j \ge \varepsilon^6 r_L/(K(1+\varepsilon)^{s+2})$. Thus, at most

\[ m \cdot |I_{\log r_L}| \cdot \frac{K(1+\varepsilon)^{s+2}}{\varepsilon^6 r_L} = \frac{K(1+\varepsilon)^{s+2}\, m}{\varepsilon^5} \]

orders of each type in each $O_t^\alpha$ can start before $(1+\varepsilon)^{t+1}$. The rest can have their release date increased to $(1+\varepsilon)^{t+1}$ without further affecting the cost. With this, at each point in time $(1+\varepsilon)^t$, and for each type of big order, there will be only

\[ (1+\varepsilon)^t\, mK(1+\varepsilon)^{2s+3}/\varepsilon^6 = (1+\varepsilon)^t\, O(mK/\varepsilon^7) = (1+\varepsilon)^t\, O(m) \]

total processing time released at $(1+\varepsilon)^t$. To conclude that there will be in total $(1+\varepsilon)^t O(m)$ processing time of big orders released at $(1+\varepsilon)^t$, it is sufficient to show that there are only $O(1)$ different types of big orders in $O_t$.

Lemma 5.6. At any point in the algorithm and at any time index $t$, there is only a constant number, $K^{O(\log(K/\varepsilon))}$, of different types of big orders released at $(1+\varepsilon)^t$.

Proof. As shown before, every job $j$ that belongs to an order $L \in O_t$ satisfies

\[ \frac{\varepsilon^6 (1+\varepsilon)^t}{K(1+\varepsilon)^{s+2}} \le p_j \le \frac{(1+\varepsilon)^t (1+\varepsilon)^{s+1}}{\varepsilon}, \]

where the second inequality follows since $p_j$ is smaller than $(1+\varepsilon)^{s+1}/\varepsilon$ times the release date of $L \ni j$. Thus, $p_j$ can only take $\lceil 2s + 3 - 7\log\varepsilon + \log K \rceil = O(\log(K/\varepsilon))$ different values, and by the definition of type there will not be more than $(K+1)^{O(\log(K/\varepsilon))}$ different ones.

Summarizing, we have proved the following.

Theorem 5.7.
At the end of the algorithm, there will be $f_1(\varepsilon, K)\, m (1+\varepsilon)^t$ total processing time corresponding to big orders (w.r.t. $(1+\varepsilon)^t$) released at $(1+\varepsilon)^t$, for every $t \in \{u, \dots, v\}$. Here the function $f_1(\varepsilon, K)$ is given by $K^{O(\log(K/\varepsilon))}/\varepsilon^7 = K^{O(\log(K/\varepsilon))}$.

With this we have completely dealt with big orders, but in the process we have created several orders that are small with respect to their release dates. In Step (5) we deal with these newly created orders. As before, we must define release dates such that at any instant $(1+\varepsilon)^t$ there is at most $(1+\varepsilon)^t O(m)$ total processing time corresponding to small orders (w.r.t. $(1+\varepsilon)^t$) released at $(1+\varepsilon)^t$. As in the case of big orders, we delay orders that cannot begin processing until the following release date. The following lemma explains how this is possible.

Lemma 5.8. Given a feasible schedule of big orders (w.r.t. their release dates), losing at most a factor of $1 + O(\varepsilon)$, small orders (w.r.t. their release dates) can be scheduled by a list scheduling procedure in nonincreasing order of $w_L/p(L)$.

Proof. Let us consider a fixed schedule of big orders. Notice that every small order can be considered as just one job. Indeed, applying the same argument as in Lemma 5.4, we can consider an order as a virtual job partially assigned to machine-interval pairs, and apply Shmoys and Tardos' theorem (Theorem 3.4).

Let us call the midpoint of a job $j$ the value $M_j = C_j - p_j/2$. Note that since we are only considering orders that are small with respect to their release date, a near-optimal schedule minimizing the sum of weighted midpoints is also near-optimal for minimizing the sum of weighted completion times. Indeed, this follows since in this case the starting time of a job is within a factor $1 + O(\varepsilon)$ of its completion time. The last observation leads us to consider the problem of optimizing the sum of weighted midpoints on a single variable-speed machine.
The speed of the machine at time $s$ is given by $v(s)$, where $v(s)$ is the number of machines that are free at $s$ in the schedule of big orders. This clearly lower bounds the cost of our original problem for the sum of weighted midpoints objective. Note that the definition of the midpoint of job $j$ in this setting should be
\[ M_j = \frac{1}{p_j} \int_0^\infty I_j(s)\, v(s)\, s \, ds, \]
where $I_j$ is the indicator function of job $j$ in the schedule, i.e., $I_j(s)$ equals one if $j$ is being processed at instant $s$, and zero otherwise. In other words, it is enough for our purpose to find an algorithm for minimizing the sum of weighted midpoints on a single machine of variable speed, and then turn it into a solution on our original schedule of big orders, increasing the cost by at most a factor $1 + O(\varepsilon)$.

Interestingly, finding the schedule minimizing the sum of weighted midpoints on one variable-speed machine can be achieved by scheduling in nonincreasing order of weight to processing time ratio (known as Smith's rule).

Claim: Let $J$ be a set of jobs with associated processing times $p_j$ and weights $w_j$. Consider a single machine with variable speed $v(s)$ for $s \in [0, \infty)$. Then scheduling jobs in nonincreasing order of $w_j/p_j$ gives a solution minimizing the sum of weighted midpoints.

To prove the claim, we proceed by contradiction. Let us consider an optimal solution for which there exist two jobs $j$ and $k$ such that $j$ is processed right before $k$ and $w_k/p_k > w_j/p_j$. Let $M_j$ and $M_k$ be the midpoints of these jobs in this schedule. Observe that swapping these two jobs decreases the cost of the schedule. To see this, let $M_j'$ and $M_k'$ be the midpoints of jobs $j$ and $k$ respectively, in the schedule where $k$ is processed before $j$. Since the two jobs together occupy the same portion of the time axis in both schedules, $p_j M_j' + p_k M_k' = p_j M_j + p_k M_k$, and since $j$ is moved to the right, $M_j' > M_j$. Using $w_k/p_k > w_j/p_j$, the difference in the cost can be evaluated as
\[ w_j M_j' + w_k M_k' - w_j M_j - w_k M_k = \frac{w_j}{p_j}\, p_j (M_j' - M_j) + \frac{w_k}{p_k}\, p_k (M_k' - M_k) = \left( \frac{w_j}{p_j} - \frac{w_k}{p_k} \right) p_j (M_j' - M_j) < 0, \]
which proves the claim.
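The claim can be checked numerically on a small instance. The following is a minimal sketch, assuming a piecewise-constant speed function given as a list of `(start, end, v)` pieces; the function names, the instance, and this representation of $v(s)$ are illustrative choices and do not come from the text. It evaluates the sum of weighted midpoints with $M_j = \frac{1}{p_j}\int I_j(s) v(s) s\,ds$ using exact rational arithmetic, and compares the Smith order against every permutation.

```python
from fractions import Fraction
from itertools import permutations


def smith_order(jobs):
    """Job names sorted by nonincreasing w_j / p_j (Smith's rule)."""
    return sorted(jobs, key=lambda j: Fraction(jobs[j][1], jobs[j][0]), reverse=True)


def weighted_midpoints(order, jobs, speed):
    """Sum of w_j * M_j when jobs run back to back in `order`.

    jobs:  dict name -> (p_j, w_j), integer values.
    speed: time-sorted list of (start, end, v); speed is v on [start, end).
           Assumed to offer enough total capacity for all jobs.
    """
    pieces = iter(speed)
    start, end, v = next(pieces)
    now = Fraction(start)
    total = Fraction(0)
    for name in order:
        p, w = jobs[name]
        remaining, moment = Fraction(p), Fraction(0)
        while remaining > 0:
            if v == 0 or now >= end:
                # Current piece is idle or exhausted: advance to the next one.
                start, end, v = next(pieces)
                now = max(now, Fraction(start))
                continue
            work = min(remaining, v * (end - now))
            dt = work / v
            # Integral of s * v over [now, now + dt].
            moment += v * ((now + dt) ** 2 - now ** 2) / 2
            now += dt
            remaining -= work
        total += w * moment / p
    return total


# Hypothetical instance: three jobs (p_j, w_j) and a machine that runs at
# speed 2 on [0, 1), is blocked on [1, 3), and runs at speed 1 on [3, 10).
jobs = {"a": (2, 1), "b": (1, 3), "c": (3, 2)}
speed = [(0, 1, 2), (1, 3, 0), (3, 10, 1)]

order = smith_order(jobs)
best = min(weighted_midpoints(perm, jobs, speed) for perm in permutations(jobs))
print(order, weighted_midpoints(order, jobs, speed), best)
```

On this instance the Smith order attains the minimum over all six permutations. Exact fractions are used instead of floats so that the comparison with the brute-force minimum is not affected by rounding.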
Finally, we show that if we apply Stretch Intervals to the schedule of big orders on the $m$ machines, we can use Smith's rule (list scheduling) over small orders, yielding a near-optimal schedule. Indeed, applying Stretch Intervals to a schedule introduces an extra $\varepsilon^2 (1+\varepsilon)^t$ idle time in every machine and interval $I_t$, as long as no job of a big order completely crosses it. Clearly this increases the processing capacity of the $m$ machines enough to ensure that the load corresponding to small orders processed in any interval surpasses that processed in the same interval on the single machine with variable speed. This implies that, for every small order, viewed as a single job $j$, the starting time $S_j$ in the parallel machine schedule and the completion time $C_j^S$ in the single machine schedule satisfy $S_j \le (1+\varepsilon) C_j^S$. The result follows.

Note that the variable-speed single machine scheduling problem defined in the proof of the last lemma is NP-hard when the objective is to minimize the sum of weighted completion times. This follows by a simple reduction from number partition to the restricted case in which the speed $v(s) \in \{0, 1\}$. We do not know whether there exists a PTAS for this problem, which would also be enough for the purpose of the proof.

Lemma 5.8 implies that at each time index $t$ we can sort small orders by $w_L/p(L)$ and delay the release date of orders that do not fit inside $I_t$. After doing this, at most $\varepsilon(1+\varepsilon)^t m + \varepsilon^3 (1+\varepsilon)^t m \le \varepsilon (1+\varepsilon)^{t+1} m$ processing time of small orders will be released at $(1+\varepsilon)^t$. Putting this fact together with Theorem 5.7, we obtain the following result and its corresponding corollary.

Theorem 5.9. At the end of the algorithm the following holds: for every time index $t$, there is $(f_1(\varepsilon, K) + \varepsilon(1+\varepsilon))\, m (1+\varepsilon)^t$ total processing time released at $(1+\varepsilon)^t$.

Corollary 5.10. Let $g(\varepsilon, K) := \lceil \log((f_1(\varepsilon, K) + \varepsilon(1+\varepsilon))\varepsilon^{-2}) \rceil + s + 1$. There exists a $(1 + O(\varepsilon))$-approximate schedule where every order $L \in O$ is processed between $r_L$ and $r_L (1+\varepsilon)^{g(\varepsilon, K)}$.
Proof. Let us consider any $(1 + O(\varepsilon))$-approximate schedule. Applying Stretch Intervals generates $\varepsilon^2 (1+\varepsilon)^t$ extra idle space for each interval-machine pair $(I_t, i)$, and $m\varepsilon^2 (1+\varepsilon)^t$ in total for each interval. For a fixed $t$, we move to the left the orders released at $(1+\varepsilon)^t$ that are completely scheduled after $(1+\varepsilon)^{t+g(\varepsilon,K)-s}$, by using the space corresponding to the interval starting at $(1+\varepsilon)^{t+g(\varepsilon,K)-s}$ created by Stretch Intervals. The rest of the orders must start processing before $(1+\varepsilon)^{t+g(\varepsilon,K)-s}$, and since they do not cross more than $s$ intervals they will finish before $(1+\varepsilon)^{t+g(\varepsilon,K)}$. In this way the cost is not increased by more than a factor $1+\varepsilon$, since after applying Stretch Intervals we only move jobs to the left. Also, the structure of the near-optimal solution given in Lemma 5.1 is preserved because orders that are moved can be processed consecutively.

To conclude we must show that we can process all orders released at $(1+\varepsilon)^t$ in the idle space corresponding to the interval starting at $(1+\varepsilon)^{t+g(\varepsilon,K)-s}$ created by Stretch Intervals. For that, it is enough to notice that the total processing time released at $(1+\varepsilon)^t$ is smaller than the extra idle space in the interval starting at $(1+\varepsilon)^{t+g(\varepsilon,K)-s-1}$. Indeed,
\[ m\varepsilon^2 (1+\varepsilon)^{t+g(\varepsilon,K)-s-1} = m\varepsilon^2 (1+\varepsilon)^{t + \lceil \log((f_1(\varepsilon,K) + \varepsilon(1+\varepsilon))\varepsilon^{-2}) \rceil} \ge m(1+\varepsilon)^t (f_1(\varepsilon, K) + \varepsilon(1+\varepsilon)). \]
Finally, since for every sufficiently small $\varepsilon$ every order released at $(1+\varepsilon)^t$ is small w.r.t. $(1+\varepsilon)^{t+g(\varepsilon,K)-s-1}$, we can accommodate all its jobs in $(1+\varepsilon)^{t+g(\varepsilon,K)-s}$. Clearly this can be done simultaneously for every $t = u, \dots, v$.

5.3 Polynomial Representation of Order's Subsets

The objective of this section is to find a collection $\Theta_\ell$ of sets of orders that are processed in block $B_\ell$ or before in a near-optimal schedule. To do this we will give a collection $U_\ell$ of sets of orders that are processed in $B_{\ell+1}$ or later. Clearly, this also defines the sets in $\Theta_\ell$ by simply taking the complement of each set in $U_\ell$.
Note that every set in $U_\ell$ must contain all orders with release date larger than or equal to $(1+\varepsilon)^{(s+1)(\ell+1)}$, so it is enough to decide which orders are going to be processed after $B_{\ell+1}$ among those released before $(1+\varepsilon)^{(s+1)(\ell+1)}$. Also, by Corollary 5.10 no order finishes more than $g(\varepsilon, K)$ intervals after its release date. Then, to guess which orders are going to be processed after $(1+\varepsilon)^{(s+1)(\ell+1)}$ we only consider orders with release dates between $(1+\varepsilon)^{(s+1)(\ell+1)-g(\varepsilon,K)}$ and $(1+\varepsilon)^{(s+1)(\ell+1)}$.

For the sake of clarity, we define the sets $\Theta_\ell$ using the following algorithm.

Algorithm: Polynomial Representation of Order's Subset

1. Let $r_j := (1+\varepsilon)^{\lfloor \log \varepsilon p_j \rfloor}$, $u := \log(\min_{j \in J} r_j)$, $v := \lceil \log(\sum_{j \in J} p_j) \rceil$, $P := \lceil \log(\max_{j \in J} p_j) \rceil$ be as in Algorithm: Localization. Define $\tilde{u} = \lfloor u/(s+1) \rfloor$ and $\tilde{v} = \lfloor v/(s+1) \rfloor$ as a lower and upper bound for the indices of blocks.

2. For all $t = u, \dots, v$:
(i) Consider $A_t \subseteq \{0, \dots, K\}^P$, the set of possible types of big orders released at $(1+\varepsilon)^t$. Note that by Lemma 5.6, $|A_t| = K^{O(\log(K/\varepsilon))} = O(1)$.
(ii) For every $\alpha \in A_t$, consider the set $O_t^\alpha$ defined in Algorithm: Localization and sort its elements in nonincreasing order of weight. Define the collection of nested sets of heavier orders in $O_t^\alpha$ as:
\[ \mathcal{W}_t^\alpha := \{ W \subseteq O_t^\alpha : W \text{ contains the } k \text{ orders with largest } w_L \text{ in } O_t^\alpha \text{ for some } k = 0, \dots, n \}. \]
(iii) Consider the set $R_t$ defined in Algorithm: Localization and sort its elements in nonincreasing order of $w_L/p(L)$. Define the collection of nested sets of orders having larger weight to processing time ratio in $R_t$ as:
\[ \mathcal{V}_t := \{ V \subseteq R_t : V \text{ contains the } k \text{ orders of largest } w_L/p(L) \text{ in } R_t \text{ for some } k = 0, \dots, n \}. \]
(iv) Define the collection of possible sets of orders released at $(1+\varepsilon)^t$, formed by all possible unions of sets in $\mathcal{V}_t$ and $\mathcal{W}_t^\alpha$, for all $\alpha \in A_t$, as:
\[ \mathcal{N}_t := \Big\{ V \cup \bigcup_{\alpha \in A_t} W_\alpha : V \in \mathcal{V}_t \text{ and } W_\alpha \in \mathcal{W}_t^\alpha \Big\}. \]

3. For all $\ell = \tilde{u}, \dots, \tilde{v}$, define $U_\ell$ as the collection of all sets of the form
\[ \{ L \in O : L \text{ is released after time } (1+\varepsilon)^{(s+1)(\ell+1)} \} \cup \bigcup_{t=0}^{g(\varepsilon,K)} N_t, \]
where $N_t \in \mathcal{N}_{(s+1)(\ell+1)-g(\varepsilon,K)+t}$ for $t = 0, \dots, g(\varepsilon, K)$.

4. For all $\ell = \tilde{u}, \dots, \tilde{v}$, let $\Theta_\ell$ be the collection containing the complement of every set in $U_\ell$.

In Step (2) of the algorithm we construct a collection $\mathcal{N}_t$ for every time index $t$, containing the sets of orders released at $(1+\varepsilon)^t$ that could be processed after an arbitrary time instant in a near-optimal schedule. Let us consider only orders released at $(1+\varepsilon)^t$. As described in Step (2.ii), we first construct the collection of possible sets of big orders (w.r.t. $(1+\varepsilon)^t$) of a given type $\alpha$. Since for any two orders of the same type the one with largest weight is scheduled first, the orders of type $\alpha$ that are processed the latest are those with the smallest weight. Then we can restrict ourselves to at most $n$ sets for every type of order. Analogously, by Lemma 5.8, small orders (w.r.t. $(1+\varepsilon)^t$) can be scheduled in nonincreasing order of $w_L/p(L)$. Finally, in Step (2.iv) we construct $\mathcal{N}_t$ as the collection containing all possible combinations of sets formed as the union of a set of every type of big order and a set of small orders. Since we are considering only $n$ sets of orders for each type of big order and there are only $K^{O(\log(K/\varepsilon))}$ different types, $|\mathcal{N}_t| = n^{K^{O(\log(K/\varepsilon))}}$.

In Step (3) we define the collections of sets $U_\ell$ by combining all possible sets in $\mathcal{N}_t$ for $t$ corresponding to orders that can be processed in $B_{\ell+1}$ or later, i.e., for $t = (s+1)(\ell+1) - g(\varepsilon, K), \dots, v$. Clearly $|U_\ell| \le |\mathcal{N}_t|^{g(\varepsilon,K)} = n^{K^{O(\log(K/\varepsilon))}/\varepsilon^2}$, which is polynomial in $n$. Finally, in Step (4) we construct $\Theta_\ell$ by taking the complement of every set in $U_\ell$.

5.4 Polynomial Representation of Frontiers

We now prove that, for a given block, it is enough to consider only a polynomial number of outgoing frontiers. This is necessary to apply the described dynamic programming algorithm.
Let us consider a fixed block $B_\ell$. Recall that an outgoing frontier can be seen as a vector whose $i$-th component denotes the value of the frontier corresponding to machine $i$. It is important to observe that, whenever the frontier corresponds to an outgoing frontier of block $B_\ell$, we can restrict ourselves to frontiers whose entries belong to
\[ \Gamma_\ell := \{ (1+\varepsilon)^{(s+1)(\ell+1)} + k\varepsilon^3 (1+\varepsilon)^{(s+1)(\ell+1)-1} : k = 0, \dots, \lceil (1+\varepsilon)[(1+\varepsilon)^s + 1]/\varepsilon^3 \rceil \}. \]

Indeed, notice first that if for some schedule $F_{\ell-1}$ and $F_\ell$ are outgoing frontiers of blocks $B_{\ell-1}$ and $B_\ell$ respectively, then there is a schedule such that for every machine either the difference between the two frontiers is greater than $\varepsilon^2 (1+\varepsilon)^{(s+1)(\ell+1)-1}$, or the frontier $F_\ell$ coincides with the beginning of the block $B_{\ell+1}$. Otherwise, the set of jobs processed between $F_{\ell-1}$ and $F_\ell$ on machine $i$ has total processing time smaller than $\varepsilon^2 (1+\varepsilon)^{(s+1)(\ell+1)-1}$, thus Stretch Intervals will create enough time in $I_{(s+1)(\ell+1)-1}$ to fit all these jobs. Then $F_\ell$ can be considered to coincide with the beginning of $B_{\ell+1}$.

In this latter case it is clear that by taking $k = 0$ the corresponding entry of the outgoing frontier belongs to $\Gamma_\ell$. On the other hand, in the former case the difference between the frontiers is greater than $\varepsilon^2 (1+\varepsilon)^{(s+1)(\ell+1)-1}$, and by Stretch Completion Times we can generate at least $\varepsilon^3 (1+\varepsilon)^{(s+1)(\ell+1)-1}$ total idle time between the frontiers. By moving all jobs between the frontiers as far as possible to the left (without modifying the left frontier), all created idle time can be assumed to be next to $F_\ell$. Then we can move $F_\ell$ to the left in order to bring its corresponding component to an element of $\Gamma_\ell$. Clearly this whole procedure increases the cost by at most a factor $1 + O(\varepsilon)$.

With the previous observations we can restrict ourselves to only $|\Gamma_\ell|^m$ different outgoing frontiers for each block. However, this is not of polynomial size.
To overcome this difficulty, we consider a more concise representation of outgoing frontiers.

Concise description of frontier: A concise outgoing frontier $\hat{F}_\ell$ of block $B_\ell$ is a vector of $|\Gamma_\ell|$ entries, in which the $k$-th component is the number of machines that have $(1+\varepsilon)^{(s+1)(\ell+1)} + (k-1)\varepsilon^3 (1+\varepsilon)^{(s+1)(\ell+1)-1}$ as the value of their frontier. Then the set of all possible concise outgoing frontiers is given by $\widehat{\mathcal{F}}_\ell := \{0, \dots, m\}^{|\Gamma_\ell|}$.

This description of a concise outgoing frontier is not enough to represent every possible block $B_\ell$ lying between $F_{\ell-1}$ and $F_\ell$. Nevertheless, by doing some extra enumeration the representation is achieved. Since all machines are identical, a block $B_\ell$ with incoming and outgoing frontiers $F_{\ell-1}$ and $F_\ell$ can be fully described in the following way: for each pair of elements $t_1 \in \Gamma_{\ell-1}$ and $t_2 \in \Gamma_\ell$ we need
\[ |\{ i = 1, \dots, m : (F_{\ell-1})_i = t_1 \text{ and } (F_\ell)_i = t_2 \}|, \]
i.e., the number of machines available from $t_1$ to $t_2$ in the block. Clearly for each pair $\hat{F}_{\ell-1}$ and $\hat{F}_\ell$ we can enumerate all such descriptions coinciding with $\hat{F}_{\ell-1}$ and $\hat{F}_\ell$. Indeed, for each element $z \in \{0, \dots, m\}^{|\Gamma_{\ell-1}| \times |\Gamma_\ell|}$, $z$ coincides with $\hat{F}_{\ell-1}$ and $\hat{F}_\ell$ if and only if
\[ \sum_{j \in \Gamma_\ell} z_{ij} = (\hat{F}_{\ell-1})_i \ \text{ for all } i \in \Gamma_{\ell-1}, \qquad \sum_{i \in \Gamma_{\ell-1}} z_{ij} = (\hat{F}_\ell)_j \ \text{ for all } j \in \Gamma_\ell. \]
This requires checking $m^{|\Gamma_{\ell-1}| \cdot |\Gamma_\ell|} = m^{O(1/\varepsilon^8)}$ possible block descriptions. Also, the number of concise frontiers is bounded by $|\widehat{\mathcal{F}}_\ell| \le m^{|\Gamma_\ell|} = m^{O(1/\varepsilon^4)}$.

5.5 A PTAS for a specific block

To conclude our algorithm we need to compute the table $W(\ell, F', F, U)$ as a subroutine in Algorithm: PTAS-DP. That is, for a given block $B_\ell$, concise incoming and outgoing frontiers $F'$ and $F$, and subset of orders $U$, we need to find a $(1+\varepsilon)$-approximate solution of the schedule minimizing the sum of weighted completion times of jobs in $U$ inside $B_\ell$. Note that it is possible to move the frontiers by a factor $1+\varepsilon$ without increasing the cost of a global solution by more than a factor $1+\varepsilon$.
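To make the concise frontiers of Section 5.4 concrete, the following minimal sketch builds a concise frontier from a per-machine frontier and enumerates the block descriptions $z$ whose row sums match the incoming concise frontier and whose column sums match the outgoing one. The function names and the tiny instance are hypothetical, and the enumeration is a plain recursive one chosen for readability, not the procedure used in the thesis.

```python
def concise_frontier(frontier, gamma):
    """Count, for each value in `gamma`, the machines whose frontier has that value."""
    return [sum(1 for f in frontier if f == value) for value in gamma]


def bounded_rows(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if not caps:
        if total == 0:
            yield ()
        return
    for x in range(min(total, caps[0]) + 1):
        for tail in bounded_rows(total - x, caps[1:]):
            yield (x,) + tail


def block_descriptions(f_in, f_out):
    """Yield every matrix z (list of row tuples) with row sums f_in and column sums f_out."""
    def fill(i, cols_left):
        if i == len(f_in):
            if all(c == 0 for c in cols_left):
                yield []
            return
        for row in bounded_rows(f_in[i], cols_left):
            rest = [c - x for c, x in zip(cols_left, row)]
            for tail in fill(i + 1, rest):
                yield [row] + tail
    yield from fill(0, list(f_out))


# Hypothetical example with m = 3 machines and two frontier values per side.
gamma_prev, gamma_cur = [5, 7], [11, 13]
f_in = concise_frontier([5, 5, 7], gamma_prev)      # two machines start at 5, one at 7
f_out = concise_frontier([11, 13, 13], gamma_cur)   # one machine ends at 11, two at 13
for z in block_descriptions(f_in, f_out):
    print(z)
```

Each printed matrix tells how many machines are available from the $i$-th incoming value to the $j$-th outgoing value, which is exactly the information that the concise frontiers alone do not determine.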
In the sequel, we consider orders and jobs as big or small with respect to the beginning of block $B_\ell$. In other words, a job will be small if its processing time is smaller than $\varepsilon^3 (1+\varepsilon)^{(s+1)\ell}$, and big otherwise. Additionally, an order will be small if its total processing time is smaller than $\varepsilon^2 (1+\varepsilon)^{(s+1)\ell}$, and big otherwise. Following the ideas of the previous sections, we enumerate over schedules of big orders, and apply Lemma 5.8 to greedily assign small orders using Smith's rule.

Algorithm: PTAS for a block

1. Redefine the release dates $r_L := (1+\varepsilon)^{(s+1)\ell}$ for all $L \in U$.

2. Apply Step (2) of Algorithm: Localization, and round the processing time of the new jobs to the next power of $1+\varepsilon$.

3. Let $Q_\ell := \{ p \in \mathbb{R} : \log(p) \in \mathbb{N} \cap [3\log(\varepsilon) + (s+1)\ell, (s+1)(\ell+2)] \}$ be the set of possible sizes that a job belonging to a big order $L \in U$ could have. Also, define the set of possible types of big orders in $U$ as $C_\ell \subseteq \{0, \dots, K\}^P$. Additionally set
\[ \Omega_\ell := \{ (1+\varepsilon)^{(s+1)\ell} (1 + k\varepsilon^4) : k = 0, \dots, ((1+\varepsilon)^{2(s+1)} - 1)/\varepsilon^4 \} = \{ \omega_1, \dots, \omega_{|\Omega_\ell|} \}. \]
We will see that we can restrict ourselves to schedules where every big job only starts processing at an instant that belongs to $\Omega_\ell$. As we are rounding processing times and starting times we will require some extra room. Therefore, we redefine $\Omega_\ell := (1+\varepsilon)^5 \cdot \Omega_\ell$ and $\Gamma_\ell := (1+\varepsilon)^5 \cdot \Gamma_\ell$. Here, multiplying a set by a scalar means that every element gets multiplied.

4. Define a single machine configuration as a vector $S$ with $|\Omega_\ell| + 2$ entries. For $k = 1, \dots, |\Omega_\ell|$, its $k$-th entry contains a pair $(q_k, c_k) \in (Q_\ell \cup \{0\}) \times (C_\ell \cup \{\emptyset\})$, where $q_k$ can be interpreted as the processing time of a job that starts processing at $\omega_k$, and $c_k$ as the type of order that contains the job. To represent that no job starts processing at a time instant $\omega_k$ we set $S_k = (0, \emptyset)$. The last two entries of $S$, $S_{|\Omega_\ell|+1} \in \Gamma_{\ell-1}$ and $S_{|\Omega_\ell|+2} \in \Gamma_\ell$, represent the values of the incoming and outgoing frontier of block $B_\ell$ on that machine.
It is sufficient to consider vectors $S$ where there is enough space to schedule jobs of the sizes described in $S$ without overlapping, respecting the corresponding incoming and outgoing frontiers. In other words, a valid $S$ must satisfy:
(a) For each $k = 1, \dots, |\Omega_\ell|$, if $i > k$ and $\omega_i < \omega_k + q_k$ then $S_i = (0, \emptyset)$.
(b) For all $k = 1, \dots, |\Omega_\ell|$, if $\omega_k < S_{|\Omega_\ell|+1}$ then $S_k = (0, \emptyset)$.
(c) For all $k = 1, \dots, |\Omega_\ell|$, if $\omega_k + q_k > S_{|\Omega_\ell|+2}$ then $S_k = (0, \emptyset)$.
Thus let $\mathcal{S} \subseteq ((Q_\ell \cup \{0\}) \times (C_\ell \cup \{\emptyset\}))^{|\Omega_\ell|} \times \Gamma_{\ell-1} \times \Gamma_\ell$ be the set of valid single machine configurations. Notice that $\mathcal{S} = \{ S^1, \dots, S^{|\mathcal{S}|} \}$ is of constant size.

5. For a given schedule we define its parallel machine configuration as a vector $M \in \{0, \dots, m\}^{|\mathcal{S}|}$ whose $i$-th component denotes the number of machines having $S^i$ as single machine configuration. We only consider vectors $M$ that agree with the concise incoming and outgoing frontiers $F$ and $F'$. In other words, if $s_k$ denotes the $k$-th element of $\Gamma_{\ell-1}$, and $v_k$ the $k$-th element of $\Gamma_\ell$, we can restrict ourselves to vectors $M$ satisfying
\[ \sum_{i \,:\, S^i_{|\Omega_\ell|+1} = s_k} M_i = F_k \ \text{ for all } k = 1, \dots, |\Gamma_{\ell-1}|, \qquad \sum_{i \,:\, S^i_{|\Omega_\ell|+2} = v_k} M_i = F'_k \ \text{ for all } k = 1, \dots, |\Gamma_\ell|. \]
We also only consider vectors $M$ in which all big orders are completely processed, i.e., such that for every $p \in Q_\ell$ and $c \in C_\ell$,
\[ |\{ j \in J : p_j = p \text{ and } j \in L \in U \text{ for } L \text{ of type } c \}| = \sum_{i=1}^{|\mathcal{S}|} M_i \cdot |\{ k \in \{1, \dots, |\Omega_\ell|\} : S^i_k = (p, c) \}|. \]
Define the set of all such possible parallel machine configurations as $\mathcal{M}$.

6. For every parallel machine configuration $M$ in $\mathcal{M}$ do the following.
(a) For each $k = 1, \dots, |\mathcal{S}|$ associate $M_k$ of the machines with the single machine configuration $S^k$ arbitrarily. After this process, let us call $T(i)$ the single machine configuration that was associated with machine $i$. Then, for $k = 1, \dots, |\Omega_\ell|$ and for each machine $i = 1, \dots, m$ do:
i. Call $(q, c) = T(i)_k$ the job size and order type given by the single machine configuration associated with machine $i$ at time $\omega_k$.
ii. Choose the order of type $c$ of largest weight that has a job of size $q$ not yet scheduled, and process that job at time $\omega_k$ on machine $i$.
The schedule thus constructed is a best possible schedule of big orders that agrees with the parallel machine configuration $M$.
(b) Consider every small order as only one job, and schedule them in the available space using list scheduling in nonincreasing order of $w_L/p(L)$, respecting the incoming and outgoing frontiers defined by the configuration $M$, i.e., on each machine $i$ schedule jobs only between $T(i)_{|\Omega_\ell|+1}$ and $T(i)_{|\Omega_\ell|+2}$. If this is not possible, consider the cost of the schedule to be infinity.

7. Among all the schedules constructed in the last two steps, choose the one with lowest cost.

As in Algorithm: Localization, Steps (1) and (2) are useful for reducing the number of possible types of big orders to a constant. In Steps (3) and (4) we classify single machine schedules by defining single machine configurations. For that we define three sets. The first set, $Q_\ell$, contains the possible sizes that a job belonging to a big order can take. As seen in Section 5.2, the grouping in Step (2) ensures that all of these jobs have processing time greater than $\varepsilon^3 (1+\varepsilon)^{(s+1)\ell}$. Also, since all jobs must be processed inside $B_\ell$, we can assume that $p_j \le (1+\varepsilon)^{(s+1)(\ell+2)}$. This justifies the definition of the set $Q_\ell$. Moreover, it is easy to see that $|Q_\ell| = 2(s+1) - 3\log(\varepsilon) = O(\log(1/\varepsilon))$. Additionally, this implies that the set $C_\ell \subseteq \{0, \dots, K\}^P$ of possible types of big orders in $U$ has cardinality $K^{O(\log(1/\varepsilon))} = O(1)$.

Lemma 5.11. By losing at most a factor $1+\varepsilon$ we can assume that in any schedule all jobs that belong to big orders start at an instant contained in $\Omega_\ell$.

Proof. Note that after Step (2) all jobs belonging to big orders are big jobs.
Since Stretch Completion Times will produce a gap of at least $\varepsilon^4 (1+\varepsilon)^{(s+1)\ell}$ before each of these jobs, we can move each of them to the left such that its starting time is $(1+\varepsilon)^{(s+1)\ell} + i\varepsilon^4 (1+\varepsilon)^{(s+1)\ell}$ for some $i = 0, \dots, \lceil ((1+\varepsilon)^{2(s+1)} - 1)/\varepsilon^4 \rceil$.

With this it is easy to see that the cardinality of $\mathcal{S}$ is $K^{O(\log(1/\varepsilon)/\varepsilon^6)}/\varepsilon^8 = K^{O(\log(1/\varepsilon)/\varepsilon^6)} = O(1)$. Therefore, by construction, the cardinality of the set $\mathcal{M}$ of possible parallel machine configurations is at most $(m+1)^{K^{O(\log(1/\varepsilon)/\varepsilon^6)}} = m^{O(1)}$.

In Step (6) we enumerate over all possible parallel machine configurations and construct the schedule of smallest cost. This can be easily done by following the same argument as in previous sections: for a given type of order, the one with the largest weight must be processed first. This justifies the schedule of big orders constructed in Step (6.a). Finally, following Lemma 5.8, in Step (6.b) we schedule small orders greedily using Smith's rule.

Overall, the rounding of processing times, the rounding of starting times, and the grouping and stretching needed for successfully applying Smith's rule (see Lemma 5.8) require extra space in the block to guarantee that our enumeration and greedy processes actually find a near-optimal solution. This extra room is added at the end of Step (3) and it is no more than a factor $(1+\varepsilon)^5$. We can conclude that Algorithm: PTAS for a block gives a near-optimal schedule in block $B_\ell$ for orders in $U$ between the concise frontiers $F$ and $F'$, with time moved to the right by a factor $(1+\varepsilon)^5$.

It is important to remark that the same cost is achieved for any permutation of machines. This is useful to reconstruct the optimal solution once the table of the dynamic program is filled. Since we are only describing the frontier in a concise manner, we do not know precisely which machine has which value of the frontier. A way to overcome this is to construct the schedule from right to left.
First we fix the machine permutation of the outgoing concise frontier of the last block (i.e., we fix a precise outgoing frontier). With this we can compute a specific schedule of the block that complies with such a frontier and with the concise incoming frontier, using the previous PTAS. This in turn fixes a specific incoming frontier, which we use as the outgoing frontier of the previous block. Thus we have proved the following.

Theorem 5.12. Algorithm: PTAS-DP is a polynomial time approximation scheme for the restricted version of $P|part|\sum w_L C_L$ when the number of jobs per order is bounded by a constant $K$.

Note that, since $n > m$ for any nontrivial instance, a straightforward calculation shows that the running time of this algorithm is given by
\[ n^{K^{O(\log(K/\varepsilon))}/\varepsilon^2}\, m^{K^{O(\log(1/\varepsilon)/\varepsilon^6)}} = n^{K^{O(\log(K/\varepsilon))}}\, m^{K^{O(\log(1/\varepsilon)/\varepsilon^6)}} = n^{K^{O(\log(K/\varepsilon)/\varepsilon^6)}}, \]
which is polynomial for fixed $K$ and $\varepsilon$.

5.6 Variations

In the last section we showed a PTAS for minimizing the sum of weighted completion times of orders on parallel machines when the number of jobs per order is a constant. Now we show how to bypass this assumption by assuming instead that the number of machines $m$ is a constant independent of the input. Indeed, we will show that the exact same algorithm gives a PTAS for this case.

Theorem 5.13. Algorithm: PTAS-DP is a PTAS for $Pm|part|\sum w_L C_L$.

Proof. It is sufficient to notice that after applying Step (2) of Algorithm: Localization every order that is small w.r.t. its release date consists of only one job, and that every big order w.r.t. its release date contains only jobs bigger than $\varepsilon^3 r_L$. Then, since every order finishes within $s$ intervals, no order can have more than $m r_L (1+\varepsilon)^s/(\varepsilon^3 r_L) = O(m) = O(1)$ jobs in it.

The restricted case in which the number of orders is constant is considerably simpler for two reasons.
First, the number of possible subsets of orders is also constant, and therefore Steps (1) and (2) of Algorithm: PTAS-DP are not necessary: simply define $\Theta_\ell$ as the power set of $O$. Second, the number of possible types of orders is also constant, and therefore Algorithm: PTAS for a block takes polynomial time. Let us call this modified version Algorithm: PTAS-DP II. Then:

Theorem 5.14. Algorithm: PTAS-DP II is a polynomial time approximation scheme for the restricted version of $P|part|\sum w_L C_L$ when the number of orders is bounded by a constant $C$.

A simple, though careful, calculation shows that the running time of Algorithm: PTAS-DP II is $O(n) \cdot m^{C^{O(1/\varepsilon^6)}}$, which is polynomial.

Chapter 6

Concluding remarks and open problems

In this work we studied the machine scheduling problem of minimizing the sum of weighted completion times of orders under different environments. In Chapter 3 we first studied some rounding techniques for the special case of minimizing the makespan on unrelated machines. We showed how a very naive rounding can transform any preemptive schedule into a nonpreemptive one, without increasing the makespan by more than a factor of 4. Then, we proved that this rounding method is best possible by showing a family of almost tight instances.

In Chapter 4 we presented approximation algorithms for $R|r_{ij}|\sum w_L C_L$ and its preemptive version $R|r_{ij}, pmpt|\sum w_L C_L$. Both algorithms are based on linear programming relaxations, and use, among other things, a rounding technique very similar to the one developed in the previous chapter for the makespan case. Even though these are the only constant factor approximation algorithms known for these problems, the large approximation factors leave several questions open in terms of the approximability of each of them. First, we may ask whether the roundings used in the algorithms can be improved.
At first glance, what seems most feasible to improve is the naive trimming of the $y$ values (Step 2 of Algorithm: Greedy Preemptive LP and of Algorithm: Greedy-LP). Although it is not a proof, we showed in Chapter 3 that truncating the variables in a very similar way is best possible for the special case of minimizing the makespan. This suggests that this step cannot be significantly improved in the more complex algorithms Greedy Preemptive LP and Greedy-LP. To reach a more precise conclusion, it would be interesting to find tight instances for the polyhedra used in each of the algorithms. One possible direction for this would be to generalize the family of almost tight instances shown in Section 3.3, although it is not clear how to do this.

Recall that the best known hardness result for $R||\sum w_L C_L$ derives from the fact that it is NP-hard to approximate $R||C_{\max}$ within a factor better than 3/2. Considering that the algorithm given in this work achieves a performance guarantee of 27/2, it would be interesting to close this gap, or at least reduce it. Given the generality of our model, it seems easier to do this by giving a reduction specifically designed for our problem, improving the 3/2 hardness result for our case.

In Chapter 5 we gave a PTAS for $P|part|\sum w_L C_L$, where either the number of jobs per order, the number of orders or the number of machines is constant. This generalizes several previously known PTASs, such as the ones for $P||C_{\max}$ and $P||\sum w_j C_j$. Thus, it would be interesting to settle whether the unrestricted case $P|part|\sum w_L C_L$ is APX-hard.

Also, in this chapter we introduced the problem of minimizing the sum of weighted midpoints of jobs on a variable-speed machine, proving that it can be solved in polynomial time by a greedy algorithm. We also briefly discussed the problem of minimizing the sum of weighted completion times on a variable-speed machine.
This problem, which can be proved to be NP-hard, has no known constant factor approximation algorithm, nor a proof showing that one cannot exist. Answering this question would be of great interest given the very natural settings where this problem could arise.

Finally, another possible direction for further research is to study the problem of minimizing the sum of weighted completion times of orders in an online setting. In this variant orders arrive over time and no information is known about an order before it has arrived. In online problems we are interested in comparing the cost of our solution to the cost of the optimal solution in the offline setting, where all the information is known from time 0. For this goal the notion of α-points (see for example [18, 5, 35, 10]) has proven useful for the problem of minimizing the sum of weighted completion times of jobs, and thus it would be interesting to study the use of this technique in our more general setting.

Bibliography

[1] F. Afrati, E. Bampis, C. Chekuri, D. Karger, C. Kenyon, S. Khanna, I. Milis, M. Queyranne, M. Skutella, C. Stein, M. Sviridenko, 1999. “Approximation schemes for minimizing average weighted completion time with release dates”. Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 32–43.
[2] C. Ambühl and M. Mastrolilli, 2006. “Single Machine Precedence Constrained Scheduling is a Vertex Cover Problem”. Proceedings of the 14th Annual European Symposium on Algorithms (ESA), LNCS 4168, 28–39.
[3] C. Ambühl, M. Mastrolilli, O. Svensson, 2007. “Inapproximability Results for Sparsest Cut, Optimal Linear Arrangement, and Precedence Constrained Scheduling”. Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 329–337.
[4] C. Chekuri, R. Motwani, 1999. “Precedence constrained scheduling to minimize sum of weighted completion times on a single machine”. Discrete Applied Mathematics, 98:29–38.
[5] C. Chekuri, R. Motwani, B. Natarajan, C. Stein, 2001. “Approximation techniques for average completion time scheduling”. SIAM Journal on Computing, 31:146–166.
[6] F. Chudak, D. S. Hochbaum, 1999. “A half-integral linear programming relaxation for scheduling precedence-constrained jobs on a single machine”. Operations Research Letters, 25:199–204.
[7] Z. Chen and N. G. Hall, 2001. “Supply chain scheduling: assembly systems”. Working Paper, Department of Systems Engineering, University of Pennsylvania.
[8] Z. Chen and N. G. Hall, 2007. “Supply chain scheduling: conflict and cooperation in assembly systems”. Operations Research, 55:1072–1089.
[9] R. W. Conway, W. L. Maxwell, and L. W. Miller, 1967. “Theory of Scheduling”. Addison-Wesley, Reading, Mass.
[10] J. R. Correa, M. Wagner, 2005. “LP-Based Online Scheduling: From Single to Parallel Machines”. Proceedings of the 11th Conference on Integer Programming and Combinatorial Optimization (IPCO), 3509:196–209.
[11] Stephen Cook, 1971. “The complexity of theorem proving procedures”. Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, 151–158.
[12] J. R. Correa and A. S. Schulz, 2005. “Single Machine Scheduling with Precedence Constraints”. Mathematics of Operations Research, 30:1005–1021.
[13] M. E. Dyer and L. A. Wolsey, 1990. “Formulating the single machine sequencing problem with release dates as a mixed integer program”. Discrete Applied Mathematics, 26:255–270.
[14] L. Danzer, B. Grünbaum, and V. Klee, 1963. “Helly’s theorem and its relatives”. In Convexity, Proceedings of the Symposium in Pure Mathematics, 7:101–180.
[15] W. L. Eastman, S. Even, I. M. Isaacs, 1964. “Bounds for the optimal scheduling of n jobs on m processors”. Management Science, 11:268–279.
[16] J. Eckhoff, 1993. “Helly, Radon, and Carathéodory type theorems”. In P. M. Gruber and J. M. Wills, editors, Handbook of Convex Geometry, 389–448, North-Holland, Amsterdam.
[17] M. R. Garey, D. S. Johnson, 1979. “Computers and Intractability: A Guide to the Theory of NP-Completeness”. Freeman, New York.
[18] M. X. Goemans, 1997. “Improved approximation algorithms for scheduling with release dates”. Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, 591–598.
[19] R. L. Graham, 1966. “Bounds for certain multiprocessing anomalies”. Bell System Technical Journal, 45:1563–1581.
[20] R. L. Graham, E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, 1979. “Optimization and approximation in deterministic sequencing and scheduling: a survey”. Annals of Discrete Mathematics, 5:287–326.
[21] L. A. Hall, A. S. Schulz, D. B. Shmoys, J. Wein, 1997. “Scheduling to minimize average completion time: off-line and on-line approximation algorithms”. Mathematics of Operations Research, 22:513–544.
[22] D. Hochbaum and D. Shmoys, 1987. “Using dual approximation algorithms for scheduling problems: Theoretical and practical results”. Journal of the ACM, 34:144–162.
[23] H. Hoogeveen, P. Schuurman, G. J. Woeginger, 2001. “Non-approximability results for scheduling problems with minsum criteria”. INFORMS Journal on Computing, 13:157–168.
[24] R. M. Karp, 1972. “Reducibility Among Combinatorial Problems”. In R. E. Miller and J. W. Thatcher, editors, Complexity of Computer Computations, Plenum, New York, 85–103.
[25] E. L. Lawler and J. Labetoulle, 1978. “On Preemptive Scheduling of Unrelated Parallel Processors by Linear Programming”. Journal of the ACM, 25:612–619.
[26] J. K. Lenstra, A. H. G. Rinnooy Kan, 1978. “Complexity of scheduling under precedence constraints”. Operations Research, 26:22–35.
[27] J. K. Lenstra, D. B. Shmoys and É. Tardos, 1990. “Approximation algorithms for scheduling unrelated parallel machines”. Mathematical Programming, 46:259–271.
[28] J. Leung, H. Li, and M. Pinedo, 2006. “Approximation algorithm for minimizing total weighted completion time of orders on identical parallel machines”. Naval Research Logistics, 53:243–260.
[29] J. Leung, H. Li, M. Pinedo and J. Zhang, 2007. “Minimizing Total Weighted Completion Time when Scheduling Orders in a Flexible Environment with Uniform Machines”. Information Processing Letters, 103:119–129.
[30] J. Leung, H. Li, and M. Pinedo, 2007. “Scheduling orders for multiple product types to minimize total weighted completion time”. Discrete Applied Mathematics, 155:945–970.
[31] L. Levin, 1973. “Universal sorting problems”. Problems in Information Transmission, 9:165–266.
[32] F. Margot, M. Queyranne, Y. Wang, 2003. “Decompositions, network flows, and a precedence constrained single machine scheduling problem”. Operations Research, 51:981–992.
[33] M. Queyranne, 1993. “Structure of a simple scheduling polyhedron”. Mathematical Programming, 58:263–285.
[34] A. Schrijver, 2004. “Combinatorial Optimization”. Springer-Verlag, Germany, Volume A.
[35] A. Schulz and M. Skutella, 2002. “Scheduling unrelated machines by randomized rounding”. SIAM Journal on Discrete Mathematics, 15:450–469.
[36] A. S. Schulz and M. Skutella, 1997. “Random-based scheduling: New approximations and LP lower bounds”. In J. Rolim, editor, Randomization and Approximation Techniques in Computer Science, LNCS 1269, 119–133.
[37] D. B. Shmoys, É. Tardos, 1993. “An approximation algorithm for the generalized assignment problem”. Mathematical Programming, 62:461–474.
[38] M. Skutella, 2001. “Convex quadratic and semidefinite programming relaxations in scheduling”. Journal of the ACM, 48:206–242.
[39] M. Skutella, 2002. “List Scheduling in Order of α-Points on a Single Machine”. In E. Bampis, K. Jansen and C. Kenyon, editors, Efficient Approximation and Online Algorithms, Springer-Verlag, Berlin, 250–291.
[40] M. Skutella and G. J. Woeginger, 2000. “Minimizing the total weighted completion time on identical parallel machines”. Mathematics of Operations Research, 25:63–75.
[41] W. E. Smith, 1956. “Various optimizers for single-stage production”. Naval Research Logistics Quarterly, 3:59–66.
[42] V.
Vazirani, 2001. “Approximation Algorithms”. Springer-Verlag, New York. [43] G. J. Woeginger, 2003. “On the approximability of average completion time scheduling under precedence constraints”. Discreet Applied Mathematics, 131:237–252. 68 [44] J. Yang and M.E. Posner, 2005. “Scheduling parallel machines for the customer order problem.” Journal of Scheduling, 8:49–74. 69